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ABSTRACT 


Real-time  interactive  video  applications,  such  as  video  teleconferencing,  present 
difficult  challenges  to  network  designers  due  to  strict  quality  of  service  constraints  and 
the  limitations  of  traditional  video  compression  schemes.  These  limitations  reveal 
themselves  notably  in  two  areas:  poor  error  robustness  and  a  lack  of  flexibility  when 
dealing  with  multicast  scenarios  over  heterogeneous  networks. 

A  more  promising  approach  that  improves  error  robustness  while  also  offering  a 
solution  to  the  network  heterogeneity  problem  is  to  employ  a  layered  video  codec.  This 
thesis  presents  the  implementation  of  a  new  layered  video  codec  scheme.  Block  updating 
coupled  with  an  aging  algorithm  is  used  in  this  scheme  to  select  macroblocks  for 
transmission.  Block  updating  selects  macroblocks  that  have  changed  due  to  scene 
motion,  and  the  aging  algorithm  ensures  that  an  entire  frame  is  transmitted  within  a  set 
time  interval.  Layering  is  accomplished  through  application  of  the  fast  Haar  transform 
and/or  the  discrete  cosine  transform.  Layer  assignments  are  made  by  grouping  bands  of 
coefficients  with  similar  variances.  Quantization  and  encoding  for  motion  video  employs 
both  an  industry  standard  and  uniform  quantization  with  a  custom  variable  length  coding 
table.  For  static  slides,  uniform  quantization  and  a  second  custom  variable  length  coding 
table  are  employed.  Rate  control  is  accomplished  via  the  reduction  of  a  four-dimensional 
operational  distortion  surface  to  a  one-dimensional  optimal  curve  implemented  as  a 
simple  table  lookup  of  quantizers. 
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INTRODUCTION 


A.  BACKGROUND 

Video  teleconferencing  (VTC)  is  expected  to  contribute  significantly  to  the  US 
Navy’s  Information  Technology  for  the  21®‘  Century  (11-21)  initiative.  IT-21  seeks  to 
change  the  paradigm  to  warfighting  from  a  platform-centric  approach  to  a  network- 
centric  approach,  where  information  superiority  is  leveraged  with  smart  weapons  in  order 
to  achieve  the  desired  result  more  effectively.  The  goal  of  IT-21  is  to  link  all  US  Forces 
together  in  a  network  that  enables  transmission  of  voice,  video,  and  data  from  individual 
workstations  seamlessly  to  both  local  and  remote  users  [1][2].  Employing  VTC  over  a 
tactical  network  at  the  battle  group  level  caters  to  several  useful  applications,  such  as 
collaborative  planning,  distance  learning,  remote  maintenance  and  telemedicine. 

1.  Multimedia  Communication  and  Tactical  Video  Teleconference 
(VTC) 

In  general,  multimedia  communications  are  either  unicast  or  multicast.  Unicast 
represents  peer-to-peer  communications  while  multicast  represents  m  to  n 
communications  where  m  ranges  from  1  to  n.  Unicast  includes  any  client-server 
applications,  such  as  video  on  demand  (VOD)  or  IP  telephony.  Multicast  examples 
include  distance  learning  and  remote  conferencing.  The  tactical  VTC  scenario 
considered  here  is  inherently  a  multicast  application  running  over  a  heterogeneous, 
wireless  network.  Here,  “heterogeneous”  implies  that  a  connection  traverses  a  series  of 
links  (each  imposing  potentially  different  bandwidths),  and  the  recipient  workstations 
differ  in  capability  and  cj^acity;  “wireless”  implies  an  internetwork  of  wireline  local  area 
networks  (LANs)  connected  by  at  least  one  wireless  channel.  As  such,  each  transmitter 
transmits  to  multiple  receivers  in  the  multicast  group.  The  multicast  group  consists  of 
some  combination  of  active  participants  that  are  allowed  to  transmit  and  passive 
participants  that  only  receive.  For  example,  the  problem  of  transmitting  multicast. 
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multimedia  traffic  over  ATM  networks  was  examined  in  [3].  This  situation  is  illustrated 
in  Figure  I-l. 


While  the  network  described  here  affords  large  bandwidth  within  each  shipboard 
LAN,  VTC  realizes  its  true  value  only  if  implemented  within  the  entire  battlegroup.  It  is 
because  the  wireless  channel  constrains  both  the  bit  rate  and  the  bit  error  rate  (each  of 
which,  in  turn,  impacts  the  robustness  of  the  VTC  application)  that  the  tactical  VTC 
network  presents  challenges  not  encountered  with  wireline  networks  and  provides  a 
hostile  environment  for  traditional  video  codecs. 

2.  Video  Compression  and  Robustness 

Since  video  and  audio,  to  a  lessor  degree,  are  bandwidth  intensive,  the  signals 
must  undergo  video  coding  prior  to  transmission,  trading  a  reduction  in  bandwidth  for  a 
reduction  in  quality.  Because  the  human  visual  system  (HVS)  places  greater  relative 
importance  on  lower  frequencies  than  higher  frequencies,  typical  motion  video  is 
perceived  as  primarily  lowpass  [4].  Therefore,  lossless  two-dimensional  (2-D)  transform 
methods  are  used  to  create  a  frequency  domain  representation  of  an  image.  The  HVS 
perceptive  properties  can  then  be  exploited  by  quantizing  the  resulting  coefficients  to 
varying  degrees  of  precision  with  higher  precision  allotted  to  the  lower  frequency 
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coefficients.  Quantization  reduces  the  dynamic  range  of  the  coefficients  and  loses 
information,  but  it  enables  fewer  bits  to  represent  the  coefficients.  Usually,  many  of  the 
higher  frequency  coefficients  are  zeroed  out  via  this  quantizing  process;  mns  of  zeroes 
are  created.  Because  zeros  need  not  be  explicitly  represented,  mn-length  encoding  (RLE) 
is  utilized  to  generate  a  more  compact  representation  of  the  quantized  coefficients,  which 
is  subsequently  replaced  by  a  more  efficient  lossless  variable  length  code  (VLC).  These 
techniques  are  collectively  called  spatial  compression.  Spatial  compression  is  the  basis  • 
of  image  compression  standards  like  that  of  the  Joint  Photographic  Experts  Group 
(JPEG). 

A  video  codec  may  simply  treat  each  frame  of  a  video  as  a  separate  still  image 
and  subject  it  to  spatial  compression  independently  from  the  other  frames.  This  approach 
is  known  as  intraframe  coding.  An  example  is  Motion- JPEG  (M-JPEG).  hitraframe 
coding  has  the  advantage  of  superior  error  resilience  since  decode  errors  are  confined 
strictly  to  the  current  frame.  Its  disadvantage  is  that  compression  gain  with  acceptable 
image  quality  is  limited  to  approximately  0.5  bits  per  pixel  (bpp)  [4]. 

For  a  given  quality,  higher  compression  gains  are  realized  if  the  video  codec 
exploits  the  high  degree  of  correlation  that  video  frames  tend  to  exhibit  on  a  frame-to- 
frame  basis.  This  is  called  interframe  coding,  and  it  eliminates  redundancy  by  coding 
only  the  differences  in  successive  frames.  The  compression  gains  achieved  by  interframe 
coding  vary  in  relation  to  the  degree  and  type  of  motion  that  occurred  between  successive 
frames.  Nearly  static-content  frames  exhibit  high  correlation  and  result  in  high 
compression  gain  while  highly  dynamic-content  frames  have  little  correlation  and  result 
in  less  compression  gain.  If  two  successive  frames  have  no  correlation,  perhaps  due  to  a 
scene  change,  interframe  coding  performs  no  better  than  intraframe  coding  and  typically 
performs  woree  due  to  the  overhead  required  to  track  motion.  The  disadvantage  to 
interframe  coding  results  from  the  dependency  between  frames  at  the  decoder,  errors 
occur  in  the  current  frame,  the  errors  tend  to  propagate  among  frames  as  well  as  spatially 
within  frames  [5]  [6].  Consequently,  video  codecs  like  that  of  the  Moving  Pictures 
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Experts  Group  (MPEG)  incorporate  both  types  of  compression  techniques  in  an  attempt 
to  increase  compression  gain  and  bound  error  propagation. 

3.  Traditional  Video  Codecs 

Low  bit  rate  video  coding  standards  such  as  H.261  and  H.263  perform  best  in 
homogeneous,  unicast  environments.  The  video  server  negotiates  a  desired  Quality  of 
Service  (QoS)  consistent  with  the  desired  video  quality  and  available  bandwidth  prior  to 
delivery.  Since  network  conditions  are  rarely  static,  the  received  video  quality  typically 
changes  due  to  dropped  or  incomplete  frames  caused  by  losses  within  the  network.  With 
the  implementation  of  a  closed  loop  control  scheme  via  feedback  reports  from  the 
recipient,  the  server  can  react  to  the  changing  network  conditions  and  adjust  the 
quantization,  frame  size,  or  frame  rate  in  order  to  vary  the  bit  rate. 

However,  when  traditional  video  codecs  are  applied  in  a  wireless,  multicast, 
heterogeneous  environment,  shortcomings  are  revealed.  First,  the  traditional  scheme 
relies  on  guaranteed  bandwidth  for  delivery,  trading  bandwidth  for  quality.  Selecting  an 
appropriate  quality  (and  therefore  required  bandwidth)  poses  a  dilemma  in  a 
heterogeneous  network.  Since  each  user  is  reached  by  a  different  path  on  the  network, 
each  experiences  different  levels  of  congestion.  Even  with  feedback,  the  controllable 
application  is  faced  with  a  quandary  in  determining  how  to  make  adjustments.  Sending 
higji  quality,  high  bandwidth  video  supports  some  users  but  leaves  low  bandwidth 
recipients  with  degraded  video  due  to  high  packet  loss.  If  the  lowest  common 
denominator  is  supported  instead,  all  recipients  are  forced  to  view  lower  quality  video, 
and  the  high  bandwidth  links  are  underutilized.  Clearly,  meeting  the  varied  expectations 
with  a  single  video  stream  is  impractical  and  transmitting  multiple  video  streams  with 
gradations  in  quality  demands  a  much  greater  bandwidth  expense. 

Second,  the  poor  error  robustness  demonstrated  by  traditional  low  bit  rate  video 
coders  is  especially  troublesome  since  retransmission  is  not  practical  in  a  real-time 
application.  Finally,  feedback  itself  is  undesirable  in  a  low  bit  rate  network.  Feedback 
employed  with  the  goal  of  mitigating  congestion  actually  consumes  available  bandwidth, 
causes  an  additional  load  on  constrained  nodes,  and  increases  congestion  further. 
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4.  Receiver>Based  Layered  Multicast  (RLM) 

A  promising  approach  that  offers  a  solution  to  the  network  heterogeneity  problem 
while  offering  some  improvement  in  error  robustness  is  the  receiver-based  layered 
multicast  (RLM)  scheme  proposed  by  McCanne  et  al.  [7].  RLM  employs  a  layered  video 
codec  that  transmits  video  in  scalable  layers  that  progressively  refine  quality.  An 
independently  decodable  base  layer  is  generated  that  guarantees  a  minimum  acceptable 
quality.  Separate  enhancement  layers  increase  quality  in  a  hierarchical  manner.  With 
RLM,  each  recipient  can  decode  just  the  base  layer  for  low,  but  acceptable,  quality  or  add 
one  or  more  enhancement  layers  to  improve  quality  as  bandwidth  and  hardware  permit. 
This  idea  is  illustrated  in  Figure  1-2. 


Layered  Decoder 


Lowest 

Quality 

Video 


Higher 

Quality 

Video 


RLM  represents  a  starting  point  for  designing  an  integrated  approach  to 
improving  robustness.  The  layered  structure  slightly  reduces  the  effects  of  congestion 
because  a  particular  node  only  needs  to  carry  subscribed  layers.  Unsubcribed  layers  can 
be  dropped.  Figure  1-3  illustrates  this  concept.  Furthermore,  earlier  work  by  Rhee  and 
Gibson  indicates  that  layered  video  exhibits  improved  resilience  to  bit  errors  introduced 
during  transmission  because  spreading  bit  errors  across  multiple  layers  has  less  negative 
impact  on  the  reconstructed  video  [8] . 
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Figure  1-3:  Adapting  to  Network  Heterogeneity  Using  RLM. 

B.  THESIS  OBJECTIVE 

The  main  objective  of  this  thesis  is  to  implement  in  Matlab’  the  new  layered  video 
coding  scheme  for  a  tactical  VTC  proposed  by  Parker  [9]  [10]  for  use  in  multicast, 
hetergeneous,  wireless,  asynchronous  transfer  mode  (ATM)  networks.  The  tactical  VTC 
session  is  assumed  to  consist  of  low-motion  video,  such  as  a  “talking  head”  in  the  style  of 
a  t3T)ical  newscast  and  static  displays  in  the  style  of  an  overhead  presentation.  No 
assumption  is  made  about  the  relative  proportions  of  the  two  types  of  content;  the  session 
could  be  all  low-motion  video,  a  sequence  of  static  displays,  or  any  combination  of  the 
two  types  in  any  order. 

The  complete  specifications  of  the  coder  are  detailed  in  Table  I-l  [9].  For  the 
implementation  in  this  thesis,  a  color  depth  of  8-bit  grayscale  is  evaluated,  but  the 
technique  can  be  expanded  to  include  4:2:0  sub-sampled  24-bit  color.  Audio 
compression  to  8  kbps  or  less  may  be  accomplished  via  code  excited  linear  prediction 
(CELP)  speech  coding,  but  it  is  not  addressed  further  in  this  work.  Error  robustness  is 
provided  via  a  block  updating  scheme  that  limits  the  impact  of  decoder  errors.  Layering 
is  accomplished  by  judiciously  grouping  the  frequency  domain  content  obtained  from  the 
fast  Haar  transform  (FHT)  and/or  the  discrete  cosine  transform  (DCT);  the  exact  method 
of  frequency  decomposition  depends  on  the  video  content.  Handling  both  motion  video 

'  Matlab  is  a  registered  trademark  of  The  MathWorks,  Inc. 
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and  static  slides  with  a  single  coder  requires  significant  flexibility  and  compromise  since 
the  frequency  characteristics  of  each  are  different.  Therefore,  the  coder  is  optimized  to 
handle  each  type  of  content  separately  -  separate  layering  and  separate,  custom  VLC 
tables.  Finally,  the  bit  rate  control  issue  is  examined,  and  an  approach  that  reduces  an  n- 
dimensional  rate  control  problem  to  a  simple  table  lookup  is  implemented. 


VTCStreai 

Valued-:';'  ::V? 

Video 

Bandwidth 

64-96  kbps 

Resolution 

176x144  (QCIF) 

Frame  Rate 

lOfps 

Color  Depth 

8-bit  gray/4:2:0  24-bit  color 

Audio 

BandwidA 

<  8  kbps 

Table  I>1:  Tactical  VTC  Multimedia  Requirements. 


C.  THESIS  ORGANIZATION 

Chapter  II  considers  techniques  for  coding  video,  A  general  discussion  of  video 
stream  structural  hierarchy,  transform  coding  methods,  quantization  techniques,  entropy 
encoding,  frame  coding,  and  quality  measurement  is  presented.  Chapter  IE  presents  the 
specific  techniques  utilized  in  coding  both  motion  video  and  static  slides.  Chapter  IV 
presents  results.  Conclusions  and  recommendations  for  future  study  are  given  in  Chapter 
V.  Appendix  A  and  Appendix  B  provide  the  custom  VLC  tables  used  with  motion  video 
and  static  slides,  respectively.  Appendix  C  is  a  Matlab  code  library  of  the  layered  video 
coder  implementation. 
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n.  LAYERED  VIDEO  CODER  DESIGN  CONSIDERATIONS 


The  basic  components  of  a  video  coder  are  shown  in  Figure  II- 1.  This  chapter 
begins  with  an  explanation  of  video  stream  hierarchy  and  then  discusses  each  of  the 
component  parts  displayed  in  Figure  11- 1.  Motion  compensation  is  present  only  if  the 
coder  attempts  to  exploit  frame-to-frame  correlation.  Otherwise  each  frame  is  processed 
independently.  The  chapter  concludes  by  addressing  a  method  for  quantifying  the 
amount  of  distortion  introduced  by  the  processing. 


Compressed 

Video 

Stream 


Figure  II-l:  Basic  Components  of  a  Video  Coder. 


A.  VIDEO  STREAM  STRUCTURAL  HIERARCHY 

A  video  stream  is  organized  into  a  hierarchy  of  logical  components.  Although  the 
specific  organizational  scheme  varies  with  the  particular  coder  under  consideration,  some 
of  the  most  common  components  are  the  following. 

A  frame  or  picture  is  the  basic  display  unit.  It  is  a  sampled  version  of  the  original 
scene  taken  at  a  particular  instant  in  time.  Each  frame  is  composed  of  a  rectangular  array 
of  pixels  or  points  within  the  frame.  Each  pixel  contains  a  data  structure  indicating  its 
luminosity  or  color.  The  dimensions  of  the  array  define  the  picture  resolution,  given  as 
columns  x  rows,  such  as  the  ITU-T  defined  QCIF  resolution  of  176x144.  Pixels  within  a 
frame  are  organized,  in  order  of  increasing  size,  into  blocks,  macroblocks,  and  groups  of 
macroblocks  (GOBs).  A  block  is  an  8x8  array  of  pixels;  a  macroblock  is  a  16x16  array 
of  pixels  or,  equivalently,  four  blocks.  A  frame  may  be  considered  as  rows  of 
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macroblocks.  One  or  more  contiguous  rows  of  macroblocks  are  termed  a  GOB.  This 
hierarchy  is  illustrated  in  Figure  11-2. 


Figure  11-2:  Organizational  Hierarchy  for  Compressed  Video. 


B.  TRANSFORM  CODING 

Although  compression  through  direct  scalar  quantization  of  pixel  values  is 
possible,  it  is  inefficient.  An  alternative  approach  is  to  use  transform  coding.  Because 
contiguous  pixels  within  a  frame  tend  to  be  highly  correlated,  the  application  of  a  suitable 
linear  transform^  to  decorrelate  the  pixels  yields  two  primary  advantages.  The  first  is  a 
property  termed  “energy  packing”  efficiency.  The  second  is  that  the  resulting 
coefficients  are  more  conducive  to  perceptual-based  quantization  schemes. 

^  The  transform  should  be  lossless  and  invertable.  Lossless  means  that  no  information  is  lost  through 
application  of  the  transform.  Invertable  means  that  the  original  information  is  recoverable. 


A  signal  is  decorrelated  if  the  application  of  a  transform  causes  the  signal’s 
autocorrelation  matrix  to  become  diagonal;  that  is,  it  uncorrelates  the  resulting 
coefficients.  The  optimal  transform  tightly  packs  all  the  energy  (information)  into  the 
minimum  number  of  coefficients  possible,  resulting  in  the  highest  energy  packing 
efficiency.  Arranging  these  N  coefficients  in  decreasing  order  of  magnitude  and  retaining 
only  the  first  k  coefficients  gives  the  least  distortion  as  measured  by  mean  squared  error 
(MSE)  compared  to  any  other  set  of  k  coefficients.  Similarly,  a  given  level  of 
quantization  of  the  decorrelated  coefficients  results  in  the  least  distortion  of  the  original 
data  [1 1].  Because  certain  transform  coefficients  may  hold  greater  perceptual  relevance 
by  the  HVS,  this  dependency  can  be  exploited  by  utilizing  a  frequency-based  transform 
and  then  quantizing-  a  lossy  process  -  with  a  step  size  proportional  to  the  perceptual 
importance  of  each  coefficient. 

1.  Discrete  Cosine  Transform  (DCT) 

Theoretically,  the  discrete-time  Karhunen-Loeve  transform  (KLT)  provides  the 
greatest  energy  packing  efficiency  [11].  However,  two  liabilities  preclude  the  use  of  the 
KLT  in  video  compression.  First,  the  KLT  is  extremely  computationally  intensive  - 
requiring  order  operations.  Second,  the  KLT  is  signal  dependent  -  requiring  a 
separate  eigenvector  calculation  for  each  transformed  data  block.  Instead,  transforms  that 
approach  the  energy  packing  efficiency  of  the  KLT  and  possess  more  efficient  algorithms 
are  utilized  in  video  coders. 

A  widely  used  transform  for  image  processing  is  the  two-dimensional  (2-D) 
discrete  cosine  transform  (DCT).  The  2-D  DCT  provides  energy  packing  performance 
very  close  to  that  of  the  KLT  and  can  be  implemented  with  fast  algorithms  that  reduce  the 
computational  effort  to  the  order  of  Mog2N  [4].  A  firame  is  transformed  by  partitioning 
the  frame  into  NxN  regions  of  pixels  and  applying  the  2-D  DCT  to  each  individual 
region.  N  can  be  any  integer  provided  that  integer  multiples  of  N  equals  the  overall 
dimensions  of  the  frame.  Because  the  correlation  among  contiguous  pixels  tends  to 
decrease  with  increasing  size  of  the  iVxA^  region  which,  in  turn,  decreases  the  resulting 
compression  gain,  the  typical  NxN  size  used  with  the  2-D  DCT  is  8x8  (a  block). 
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Denoting  the  original  block  as/fij)  and  the  coefficient  block  by  F(u,v),  the  2-D  DCT  is 
given  by  [4] 


F(m,  v)  =  ~  Ciu)Civ)Y^f{i,j)cos 


i=0  j=0 


(2i+l)^'\  f  (2  /  +  l);iv 


where 


—  jc  =  0 

cw  =  \S 

1,  Otherwise. 


The  inverse  DCT  is  given  by 


/(M')=-|||c(«)C(v)cos(<2i^Wp 


+  l)m> 
2N 


(H-l) 


(n-2) 


F(k,v).  (n-3) 


The  result  is  a  block  of  64  coefficients  having  a  spatial  frequency  interpretation  as 
shown  in  Figure  11-3.  The  F(0,0)  coefficient  represents  the  DC  contribution  and  the 
remaining  63  coefficients  represent  the  AC  contribution.  The  different  elements  of  an 
image  map  into  the  frequency  domain  as  shown  in  Figure  11-4  [4].  The  application  of  the 
2-D  DCT  to  an  8x8  block,  where  there  is  typically  little  variation  from  pixel  to  pixel, 
leads  to  a  predominance  of  lowpass  content  in  the  frequency  domain.  Given  this 
condition,  the  magnitudes  of  the  AC  coefficients  are  largest  in  the  region  around  the  DC 
coefficient  and  diminish  with  increasing  spatial  frequency. 
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Increasing  horizontal  frequency 


Figure  II-3:  Frequency  Interpretation  of  DCT  Coefficients. 


Increasing  horizontal  frequency 


Figure  11-4:  Structural  Decomposition  of  Image  Elements  [4]. 

2.  Discrete  Wavelet  Transform  (DWT) 

Wavelet  coding  is  another  technique  for  compressing  still  images  and  has  been 
shown  to  offer  slightly  better  image  quality  than  DCT-based  schemes  for  similar  levels  of 
compression  at  the  cost  of  greater  computational  complexity  [12].  Applying  a  wavelet 
transform  to  an  image  transforms  that  image  into  a  “multifrequency  component 
representation,”  where  each  component  has  its  own  frequency  characteristics  and  spatial 
features.^  A  discrete  wavelet  transform  (DWT)  filters  and  decimates  an  image  into 

^  The  wavelet  transform,  like  the  2-D  DCT,  is  lossless  and  invertable. 
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separate  subbands  of  coefficients  containing  a  mixture  of  the  high  frequency  and  low 
frequency  details.  Image  decomposition  is  accomplished  via  two  analysis  filters.  The 
first  extracts  the  low  frequency  content,  the  signal  average;  the  second  extracts  the  high 
fiequency  content,  the  signal  details. 

The  simplest  example  of  a  DWT  is  the  fast  Haar  transform  (FHT)  [13].  It  is 
described  by  its  signal  average  equation, 

in)  =  |(Xo(2n)  +  Xq  (2n  +  1))  (n-4) 

and  its  signal  detail  equation, 

^?i(«)=^(->fo(2«)-JCo(2»  +  l)).  (n-5) 

where  xq  is  the  original  data  vector,  and  vectors  a\  and  d\  are  the  first  order  analysis 
(decomposition)  vectors.  If  xo  contains  2^  elements,  £q)plying  Equations  II-4  and  IL5 
generates  analysis  vectors  of  length  2^^, 

Referring  to  Figure  11-5,  the  LL  subband  is  calculated  by  applying  the  average 
equation  to  the  columns  and  rows  of  the  image.  The  LL  subband  retains  the  lowpass 
information  from  the  original  image  and  presents  a  coarse  representation  of  the  original 
image.  Because  typical  images  have  lowpass  characteristics,  most  of  the  energy  from  the 
image  is  represented  in  this  subband.  The  HL  subband  is  calculated  by  applying  the 
average  equation  to  the  columns  of  the  image  and  the  detail  equation  to  the  rows.  The 
HL  subband  retains  the  vertical  edge  details  from  the  original  image.  Typically,  less 
energy  from  the  original  image  is  represented  in  the  HL  subband  compared  to  that  of  the 
LL  subband.  The  LH  and  HH  subbands  are  found  analogously  with  LH  retaining 
horizontal  edge  details  and  HH  retaining  the  diagonal  edge  details.  The  HH  subband 
typically  has  the  lowest  energy  content.  In  fact,  in  some  applications  the  HH  subband  is 
discarded  in  its  entirety  [14]. 
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Figure  n-5:  Octave-Based  Wavelet  Decomposition. 


In  order  to  differentiate  the  frequency  cont^t  of  an  image  further,  a  second  DWT 
can  be  applied.  Still  referring  to  Figure  11-5,  the  effect  of  a  second  order  octave-based 
decomposition  obtained  by  applying  the  same  analysis  filters  to  the  LL  subband  is 
illustrated.  This  increase  in  the  number  of  subbands  allows  tailoring  additional  stages  in 
the  coder  to  emphasize  the  perceptually  important  details  over  less  perceptual 
background  noise  as  will  be  discussed  in  greater  detail  later. 

3.  Comparison  of  Transfonns 

Since  the  2-D  DCT  is  applied  to  pixel  blocks  within  a  frame  that  are  subsequently 
quantized,  it  has  a  tendency  to  create  “blocking”  artifacts  that  disturb  the  continuity  of  the 
reconstructed  frame.  The  same  effect  also  leads  to  the  presence  of  “ringing”  artifacts 
around  sharp  edges.  In  contrast,  the  wavelet  transforms  typically  are  applied  to  the  entire 
image  and  separate  the  entire  image  into  regions  of  high  and  low  frequency  content. 
These  regions  may  be  quantized  and  coded  independently  and  result  in  more  efficient  bit 
allocation.  This  produces  a  more  visually  pleasing,  smoother  reconstructed  image 
compared  to  that  obtained  via  a  2-D  DCT  at  a  comparable  peak  signal-to-noise  ratio 
(pSNR).  In  general,  at  comparable  pSNRs,  wavelet  transform  coders  offer  compression 
gains  superior  to  that  of  DCT-based  coders  [12]. 

However,  several  drawbacks  have  limited  the  utility  of  wavelet-based  video 
compression.  Wavelets  achieve  quality  superior  to  DCT-methods  by  processing  the 
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entire  image  or  frame,  but  traditional  video  coders  exploit  temporal  correlation  at  a 
subframe  (usually  macroblock)  level.  Although  the  error  block  could  be  transformed  via 
a  DWT,  no  significant  advantage  has  been  determined  over  the  DCT,  and  the 
computational  effort  is  greater  [12].  However,  the  frequency  decomposition  offered  by 
DWTs  provides  a  powerful  basis  for  layered  video  coding. 

C.  QUANTIZATION  TECHNIQUES 


After  the  transform,  the  coefficients  are  quantized.  Quantization  is  a  lossy 
process  used  to  reduce  precision  and  zero  out  some  coefficient  values.  The  benefit  is  that 
each  coefficient  can  then  be  represented  with  fewer  bits.  However,  the  information  loss 
is  not  recoverable,  and  the  loss  is  manifested  as  distortion  within  the  reconstructed  image. 
This  distortion  is  termed  quantization  noise. 

The  typical  quantization  scheme,  uniform  quantization,  involves  dividing  each 
coefficient  Fuv  by  the  quantizer  step  size  Quv  and  rounding  the  result  to  the  nearest  integer 
to  produce  a  quantized  coefficient  Fqw  as  follows  [15]; 


quv 


=  nearest  integer 

V 


(n-6) 


The  values  used  in  image  reconstruction  are  then  F^«v  multiplied  by  the  step  size. 
However,  as  Equation  11-6  implies,  the  step  value  may  be  adjusted  for  each  particular 
coefficient  with  now  representing  elements  of  a  quantizer  matrix.  Choosing  the 
appropriate  step  size  involves  a  trade-off  between  acceptable  error  and  desired 
compression.  Employing  a  small  step  size  yields  low  quantization  noise  but  little 
compression.  The  opposite  is  the  case  for  a  large  step  size.  Although  uniform 
quantization  is  often  used,  the  choice  is  suboptimal  since  the  individual  coefficients  are 
not  distributed  uniformly  [16].  However,  two  broad  approaches  are  employed  to  refine 
the  selection  of  step  size:  HVS  weighting  and  bit  allocation  strategies.  HVS  weighting 
heuristically  refines  the  step  size  based  on  perceptual  relevance  while  bit  allocation 
strategies  attempt  to  spread  errors  optimally  across  all  coefficients. 
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1. 


Human  Visual  System  (HVS)  Weighting 


Because  the  HVS  places  greater  relative  importance  on  lower  frequencies  than 
higher  frequencies,  step  sizes  based  on  HVS  modeling  are  selected  such  that  lower 
frequencies  coefficients  are  quantized  more  finely  while  higher  frequency  coefficients  are 
quantized  more  coarsely  [4].  The  HVS  is  also  more  acute  to  luminous  intensity  than 
chromatic  intensity.  Therefore,  different  quantizer  matrices  are  developed  for  each.  An 
example  of  a  luminance  quantization  matrix  widely  used  in  JPEG  compression  is  given  in 
Figure  11-6  [4].  The  dimensions  match  the  8x8  block  size  used  with  the  2-D  DCT.  In 
application  of  a  quantizer  matrix,  the  entire  matrix  can  be  scaled  by  a  multiplicative 
constant  in  order  to  scale  quantization  noise  and  compression  while  maintaining  the 
relative  importance  among  the  coefficients. 
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Figure  n-6:  HVS-Based  Luminance  Quantization  Matrix  [4]. 

2.  Bit  Allocation 

Using  a  bit  allocation  approach,  the  value  of  the  step  size  is  chosen  to  minimize 
distortion  within  a  bit  budget.  It  is  a  classical  resource  allocation  problem,  where  the 
fundamental  trade-off  in  quantization  is  between  rate  (number  of  bits)  and  distortion 
(approximation  error)  formalized  as  rate-distortion  theory.  Several  sophisticated 
algorithms  utilize  Lagrange  methods  applied  to  arbitrary  rate-distortion  curves.  Of  these, 
a  popular  approach  is  varying  the  step  size  in  proportion  to  the  variance  of  the  coefficient 


17 


[17].  However,  bit  allocation  schemes,  regardless  of  the  methodology,  do  not  account  for 
the  sensitivity  to  different  spatial  frequency  characteristics  of  human  visual  perception. 

D.  ENTROPY  ENCODING 

Further  compression  can  be  realized  through  a  lossless  process  called  entropy 
encoding  that  removes  redundant  information.  The  simplest  style  of  entropy  encoding  is 
run-length  encoding  (RLE).  With  RLE,  a  data  set  is  parsed  to  locate  sequences  of 
repeated  values.  Any  such  sequence  is  replaced  by  a  codeword  consisting  of  a  delimiter, 
the  value,  and  the  number  of  times  the  value  is  repeated.  The  longer  the  sequence  of  the 
repeated  value,  the  greater  the  available  compression.  Following  quantization,  many 
elements  of  a  coefficient  block  typically  have  a  value  of  zero.  Often  many  high 
frequency  rows  and  columns  are  completely  composed  of  zeros.  Accordingly,  it  is 
advantageous  to  scan  the  coefficient  block  in  a  manner  as  to  produce  the  longest  runs  of 
zeros.  In  JPEG  compression,  scanning  the  quantized  coefficient  block  as  a  vector  in 
zigzag  fashion  starting  with  the  DC  coefficient  down  to  the  F8,8  coefficient  has  been 
shown  to  increase  the  run-length  of  zeros  [4].  Since  the  repeated  value  here  is  known  (0), 
an  adaptation  of  the  basic  codeword  scheme  cited  above  consists  of  the  run-length  of 
zeros  followed  by  the  magnitude  of  the  next  non-zero  value.  If  there  is  no  remaining 
non-zero  value  in  the  block,  a  special  end-of-block  (EQB)  codeword  is  used  instead. 

After  RLE,  the  quantized  coefficient  block  is  represented  by  a  set  of  codewords, 
where  each  codeword  represents  a  symbol  drawn  from  a  larger  source  alphabet.  Variable- 
length  coding  (VLC)  minimizes  the  average  codeword  length  by  assigning  shorter 
codewords  to  the  most  probable  symbols  and  longer  codewords  to  the  least  likely 
occurring  symbols  while  maintaining  each  uniquely  decipherable  (UD).  Huffman  coding 
is  the  most  widely  used  entropy-encoding  algorithm  and  is  guaranteed  to  produce  a 
minimum  average  length,  UD  code  [15].  The  Huffman  algorithm  uses  each  symbol’s 
probability  of  occurrence  and  builds  a  prefix  code  using  an  optimum  binary-branching 
tree.  Since  both  the  coder  and  decoder  need  to  use  the  same  coder  and  generating  a 
Hufftnan  table  is  computationally  expensive,  standard  tables  are  normally  pre-defined 
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using  data  drawn  from  test  images.  Optimal  compression  is  no  longer  guaranteed,  but 
encoding  and  decoding  are  faster,  and  the  need  to  transmit  the  VLC  table  is  avoided. 

E.  FRAME  CODING 

As  discussed  previously,  intraframe  coding  compresses  each  frame  of  a  video  as  a 
separate,  still  image.  Its  advantage  is  superior  error  resilience;  its  disadvantage  is  limited 
compression  gain.  Interframe  coding  exploits  frame-to-frame  correlation  by  coding  only 
the  difference  in  successive  frames,  thereby  allowing  the  potential  for  higher  compression 
gain  than  that  achieved  by  intraframe  coding  alone.  The  disadvantage  to  interframe 
coding  is  in  robustness  since  decode  errors  can  propagate  between  frames  and  spatially 
within  a  frame  depending  on  the  algorithm. 

Because  of  the  compression  required  for  the  target  VTC  scenario,  strict  intraframe 
coding  is  not  an  option.  Therefore,  a  robust  algorithm  for  interframe  coding  is  needed. 
Several  source-coding  techniques  used  to  exploit  temporal  redundancy,  such  as  motion 
compensation,  differential  pulse  code  modulation  (DPCM)  and  block  updating,  are 
discussed  in  [4].  For  a  robust  application  in  a  real-time,  heterogeneous,  multicast 
environment,  block  updating  was  judged  the  most  promising  and  is  discussed  further. 

Interframe  coding  via  block'^  updating  is  a  variation  on  intraframe  coding.  With 
block  updating,  each  block  in  the  current  frame  is  compared  to  the  corresponding  block 
in  the  previous  frame,  and  a  distance  metric  is  calculated.  If  the  metric  is  above  a 
threshold  value,  that  block  is  intracoded  and  transmitted.  Otherwise,  no  further 
consideration  is  given  to  that  block;  it  is  skipped.  Thus,  block  updating  conserves 
bandwidth  by  coding  and  transmitting  only  those  blocks  that  have  changed  perceptually 
since  the  previous  frame  [18]-  hi  low-motion  video,  such  as  the  “talking  head”  scenario, 
motion  is  confined  to  a  relatively  small  region  within  a  frame  and  the  background 
remains  static.  A  significant  bandwidth  savings  is  possible. 


*  In  this  context,  “block”  implies  an  AWV  pixel  region,  not  necessarily  8x8. 
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In  order  for  block  updating  to  be  effective,  a  suitable  distance  metric  must  be 
defined.  Common  distance  metrics  employed  are  MSB,  sum  of  absolute  differences 
(SAD),  and  absolute  sum  of  differences  (ASD)  [4]  [14]. 

Utilizing  block  updating  alone  for  a  tactical  VTC,  where  end-users  may  join  a 
session  that  is  already  in  progress,  has  a  liability.  End-users  joining  late  will  never 
receive  a  block  that  does  not  exceed  the  selection  threshold,  thus  leaving  them  with  a 
partially  reconstructed  video.  In  order  to  mitigate  this  affect,  block  updating  can  be 
combined  with  an  aging  algorithm  that  periodically  forces  block  updates.  Such  an 
algorithm  guarantees  a  full  scene  reception  within  some  set  interval  [14]. 

F.  MEASURING  QUALITY 


Given  that  video  coders  trade  compression  gain  for  image  quality,  quantifying  the 
level  of  distortion  introduced  is  useful  in  evaluating  different  coding  schemes.  A  useful 
measure  of  image  distortion  D  is  the  MSB  between  the  original  ( jc )  and  reconstructed 
(x)  images  [4]: 


N  M 


D= - 

Using  distortion  D,  the  signal-to-noise  ratio  (SNR)  is  determined  as 


(n-7) 


SM?  =  101og,o 


D 


(n-8) 


where  is  the  variance  of  the  original  image.  The  most  widely  published  measure  of 
image  quality  is  the  peak  signal-to-noise  ratio  given  by 


p5M!=10Iog,„-^, 


01-9) 


where  K  is  the  maximum  peak-to-peak  value  in  the  image,  255  for  the  typical  8-bit 
image.  For  example,  a  typical  peak  SNR  for  a  typical  JPEG  encoded  grayscale  image  is 
28  dB  at  0.5  bpp. 

Using  MSB  as  a  measure  of  image  quality  does  have  drawbacks.  MSB  does  not 
directly  define  perceptual  quality  since  all  errors  are  given  equal  weight.  Two 
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compression  techniques  yielding  the  same  MSE  for  an  image  may  deliver  slight 
differences  in  perceptual  quality. 
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in.  LAYERED  VIDEO  CODECS 


A.  LAYERED  CODING  SCHEMES 

Parker  [9]  discusses  several  previous  approaches  to  layered  video  coding.  They 
can  be  classified  into  three  fundamental  categories:  component-based  layering,  spatial- 
based  layering,  and  frequency-based  layering. 

Component-based  layered  coders  transmit  a  base  layer  and  usually  a  single 
enhancement  layer.  A  traditional  DCT-based  coder  (yielding  some  set  quality)  supplies 
the  base  layer.  The  enhancement  layer  improves  this  quality  by  either  providing 
augmenting  information  to  the  base  layer  or  correcting  distortion  present  within  the  base 
layer.  Two  such  schemes  are  proposed  by  Rhee  and  Gibson  [8]  [20].  Implicit  in  the 
component-based  approach  is  that  the  enhancement  layer  may  duplicate  information 
already  present  in  the  base  layer. 

Spatial-based  layered  coders  partition  a  frame  into  hierarchical  areas  of  interest 
and  encode  each  area  separately.  For  example,  the  “talking  head”  frame  shown  in  Figure 
ni-l  may  be  partitioned  into  two  areas  of  interest  with  the  speaker  being  of  primary 
interest  and  the  remainder  of  the  frame  being  of  secondary  interest.  With  spatial-based 
layering,  relatively  more  bandwidth  would  be  allocated  to  the  region  encompassing  the 
speaker  with  the  balance  of  available  bandwidth  being  apportioned  to  the  rest  of  the 
frame.  Bahl  and  Hsu  [21]  have  proposed  a  coder  that  incorporates  content-sensitive 
spatial  decomposition  and  multiresolution  coding.  The  difficulty  with  implementing  a 
spatial-based  coder  is  creating  the  hierarchical  areas  of  interest  dynamically  while 
minimizing  the  overhead  required  to  identify  their  shifts  relative  to  the  layer  assignments. 
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Figure  III-l:  Sample  “Talking  Head”  Video  Frame. 

Frequency-based  layered  coders  decompose  a  frame  into  subbands  and  then 
arrange  the  subbands  into  individual  layers.  The  frequency  decomposition  may  be 
applied  to  the  entire  frame  or  individual  macroblocks  within  the  frame.  Each  layer  may 
contain  one  or  more  subbands.  The  base  layer  contains  at  least  the  low  frequency 
subband  but  may  also  contain  higher  frequency  subbands  in  order  to  improve  base  layer 
quality.  Figure  111-2  illustrates  this  concept  using  the  DWT  to  decompose  a  frame,  where 
the  LL  subband  alone  constitutes  the  base  layer,  the  LH  and  HL  subbands  are  combined 
into  a  first  enhancement  layer,  and  the  HH  subband  constitutes  the  second  enhancement 
layer. 


Frame 


Base  Layer 


Enhancement 

Layer 

2”^  Enhancement 
Layer 


Figure  ni-2:  Basic  Layered  Video  Coder  Using  the  DWT. 


The  principle  advantages  to  frequency-based  layering  are  its  extendibility  and 
flexibility.  Further  decomposing  the  frame  through  application  of  the  DWT  to  each 
subband  depicted  in  Figure  111-2  creates  a  total  of  16  subbands.  Repeated  application  of 
the  DWT  to  these  subbands  creates  even  more  subbands.  With  a  larger  number  of 
subbands,  there  is  increased  flexibility  in  the  manner  in  which  they  can  be  assigned  to 
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layers.  This,  in  turn,  facilitates  exploiting  each  subband’s  perceptual  importance. 
Additionally,  a  greater  number  of  subbands  permits  a  greater  possible  number  of  layers. 
Three  coders  that  utilize  the  frequency-based  approach  are  discussed  in  [14],  [22],  and 
[23]. 

B.  A  LOW-COMPLEXITY,  ADAPTIVE  CODER  FOR  TACTICAL  VTC 

As  cited  above,  several  diverse  approaches  to  designing  layered  coders  have  been 
proposed;  each  emphasizes  different  network  architectures  or  applications  to  varying 
degrees.  However,  there  is  neither  a  consensus  in  identifying  a  preferred,  structured 
^proach  nor  a  consensus  in  quantifying  those  parameters  that  make  a  layered  coder 
effective.  This  section  presents  the  development  and  implementation  of  a  new  layered 
coder  that  is  specifically  tailored  for  the  tactical  VTC  scenario  where  limited  transmission 
bandwidth  is  available,  and  the  robustness  of  transmission  is  at  a  premium. 

Both  the  characteristics  of  the  tactical  VTC  application  and  the  desire  to  quantify 
effectiveness  guide  the  strategy  for  developing  and  implementing  the  new  layered  coder. 
Specifically,  the  application  yields  five  requirements.  The  coder  must  1)  provide  a  video 
stream  characterized  by  the  bandwidth,  resolution,  frame  rate,  and  color  depth  detailed  in 
Table  1-1,2)  optimize  compression  adaptively  for  both  low  motion  video  and  static 
slides,  3)  provide  a  low  complexity  architecture  to  minimize  coding  delays  and  power 
requirements,  4)  provide  error  resilient  decoding  at  high  packet  loss  rates,  and  5) 
constrain  bit  rate  to  a  predetermined  average. 

The  guiding  factors  stemming  from  the  desire  to  quantify  effectiveness  are 
twofold.  First,  the  coder  must  provide  a  base  layer  with  acceptable  quality  and  two  (or 
more)  enhancement  layers  that  progressively  improve  perceptual  quality.  Second,  the 
coder  must  minimize  the  bitstream  overhead  required  to  accommodate  the  layering 
structure.  A  functional  diagram  of  the  implemented  coder  is  shown  in  Figure  111-3.  Each 
component  is  addressed  in  follow  on  sections. 
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Figure  in>3:  Functional  Block  Diagram  of  Proposed  Layer  Coder. 


1.  Temporal  Compression  via  Block  Updating 

Temporal  compression  is  accomplished  via  block  updating  applied  at  the 
macroblock  level.  Only  those  macroblocks  exhibiting  sufficient  change  on  a  frame-to- 
frame  basis  are  encoded.  Although  block  updating  has  been  shown  to  yield  inferior 
compression  performance  relative  to  motion  prediction  algorithms,  block  updating  was 
chosen  because  the  performance  differential  is  small  when  low  activity  video  is 
considered  [24],  and  greater  robustness  is  realized  because  temporal  error  propagation  is 
greatly  limited  and  spatial  error  propagation  is  eliminated.  Block  updating  also  negates 
the  need  for  the  locally  decoded  reference  frame  that  is  common  to  motion  prediction 
schemes.  This  greatly  simplifies  the  coder  by  eliminating  the  overhead  required  by  the 
performance  of  an  inverse  quantization/transform  at  the  coder.  Block  selection,  as 
considered  here,  is  solely  with  regard  to  motion  video  sequences.  Since  static  slide 
sequences  exhibit  little  or  no  motion,  block  selection  via  motion  detection  is  of  limited 
utility  there.  Indeed,  transmissions  during  static  sequences  result  from  considerations 
presented  in  the  next  section,  where  macroblock  aging  is  discussed. 

Sufficient  change  between  corresponding,  frame-successive  macroblocks  is 
determined  by  calculating  a  distance  metric  between  them  and  comparing  the  result  to  a 
threshold.  The  distance  metric  utilized  is  the  non-normalized  ASD  given  by: 
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ASD  = 


(m-i) 


|m=l  n=l 


where  „  represents  the  pixel  value  within  the  current  block  while  xf  „  represents  the 

pixel  value  in  the  reference  block.  The  primary  reasons  for  choosing  ASD  over  other 
metrics  are  twofold.  First,  the  ASD  is  computationally  efficient  compared  to  MSB  and 
SAD.  The  ASD  employs  only  additions  and  subtractions  and  a  single  absolute  value 
operation.  MSB  requires  expensive  multiplication  operations  -  making  it  ill  suited  to 
real-time  applications.  SAD  requires  the  same  number  of  additions  and  subtractions  as 
ASD,  but  it  requires  MxN-  1  more  absolute  value  operations.  Second,  since  the  ASD 
takes  a  single  absolute  value  of  a  sum,  it  acts  as  an  accumulator  and  provides  a  lowpass 
filtering  effect  that  removes  noise  in  pixel  intensities  introduced  through  video  capture. 
This  smoothing  prevents  the  threshold  from  being  exceeded  spuriously  in  otherwise  static 
regions  of  the  frame  sequence  whereas  the  nonlinear  operations  performed  on  a  per-pixel 
basis  in  MSB  and  SAD  tend  to  accumulate  this  noise  energy  and  yield  spurious  block 
selections.  Thus,  the  ASD  metric  allows  bandwidth  to  be  more  effectively  devoted  to 
regions  of  interest  [14]. 

Two  independent  elements  affect  video  quality  and,  therefore,  the  required  bit 
rate:  adequate  motion  detection  (to  prevent  “jerky”  reconstructed  video)  and  control  of 
distortion  introduced  through  quantization.  The  goal  in  motion  detection  is  to  select  the 
maximum  block  selection  threshold  that  adequately  captures  motion,  hi  the  video 
sequences  examined,  a  threshold  of  160  (for  ASD)  proved  adequate  for  detecting 
perceptual  motion  and  resulted  in  an  average  of  24.8  macroblocks  selected  per  frame. 

Further  considering  the  issue  of  computational  expense,  the  ASD  metric  is  applied 
to  individual  8x8  blocks  within  the  macroblock;  the  first  block  to  exceed  the  threshold 
triggers  macroblock  selection  and  ends  the  search.  This  avoids  the  expense  of  examining 
the  remaining  blocks  of  the  macroblock.  Additionally,  since  the  HVS  acuity  is  more 
sensitive  to  changes  in  luminous  than  chromatic  intensity  [4],  distance  calculations  are 
confined  to  the  luminous  component  of  each  pixel  even  if  chromatic  information  were 
available. 
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Since  motion  in  VTC  scenes  tends  to  be  confined  to  discrete  objects  within  the 
scene  (as  opposed  to  scene  motion  caused  by  a  camera  pan),  search  efficiency  is  slightly 
affected  by  the  order  in  which  the  individual  blocks  are  examined.  The  approach  that 
proved  more  efficient  in  the  test  video  sequences  considered  here  is  to  maximize  the 
distance  between  the  first  two  blocks  examined.  As  shown  in  Figure  1II-4,  two  search 
patterns  were  compared:  a  clockwise  search  starting  from  the  upper  left  block  and  an  X- 
pattem  search  that  examines  the  upper  left  block  followed  by  the  lower  right.  Given  that 
a  macroblock  was  selected  for  transmission,  the  X-pattem  resulted  in  a  2.5%  decrease  in 
the  average  number  of  blocks  examined  per  frame. 


b. 

Figure  111-4  Block  Search  Order;  a.)  Clockwise  and  b.)  X-pattem. 

Still  greater  search  efficiency  is  realized  by  using  the  X-pattem  search  and 
varying  the  starting  block  of  each  macroblock  to  match  the  anticipated  motion  at  that 
point  in  the  frame.  Because  motion  in  VTC  sequences  is  fairly  confined,  macroblocks 
tend  to  be  selected  in  the  same  manner  frame-to-frame.  For  example,  a  speaker  shifts  left 
or  right  and/or  slightly  up  or  down.  Therefore,  search  speed  is  increased  by  having  the 
search  algorithm  remember  the  identity  of  the  specific  block,  termed  the  “anchor”,  which 
caused  a  particular  macroblock  to  be  selected  in  the  previous  frame.  For  the  subsequent 
distance  metric  calculations,  search  begins  at  the  anchor.  If  the  anchor  again  causes 
selection  or  if  the  macroblock  is  not  selected,  the  anchor  identity  is  unchanged.  If 
another  block  causes  selection,  the  anchor  identity  is  updated.  Using  this  search  scheme 
produced  an  additional  20%  reduction  in  the  number  of  blocks  examined  for  selected 
macroblocks.  A  more  complex  approach  not  examined  here  is  to  remember  the  two 
blocks  that  most  frequently  cause  selection  and  tailor  the  search  accordingly. 
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2.  Aging  Algorithm 

Utilizing  this  block  updating  scheme  alone  for  macroblock  refreshment  exposes 
some  undesirable  performance  characteristics.  One  such  consequence  is  termed 
hysteresis.  Consider  an  arbitrary  macroblock  whose  content  is  changing  temporally  due 
to  the  movement  of  an  item  within  its  spatial  bounds.  The  item  travels  from  its  initial 
position  along  some  trajectory  to  its  final  position.  At  some  point  in  the  trajectory,  let  the 
change  in  the  macroblock’s  content  be  sufficient  to  exceed  the  motion-detection 
threshold,  so  the  macroblock  is  selected  for  update.  If  the  final  position  of  the  item  has 
not  yet  been  reached  at  the  time  of  macroblock  selection,  the  item  continues  movement  to 
its  destiny.  Once  the  final  position  is  reached,  hysteresis  occurs  if  the  distance  between 
this  final  position  and  the  item’s  position  at  the  last  macroblock  update  is  insufficient  to  * 
force  another  macroblock  selection;  that  is,  the  distance  is  less  than  the  threshold.  In  this 
case,  the  displayed  macroblock  at  the  receiver  is  left  with  a  persistent  error.  As  another 
illustration  of  hysteresis,  consider  a  frame  sequence  depicting  only  slight  motion 
contained  within  an  arbitrary  macroblock’s  spatial  position,  ff  the  change  between 
successive  macroblocks  as  calculated  by  the  distance  metric  is  below  the  threshold  value, 
the  macroblock  is  not  selected  for  encoding.  If  several  frames  in  the  video  sequence 
continue  to  depict  similar  slight  motion  below  the  threshold,  the  displayed  macroblock  at 
the  receiver  eventually  shows  a  persistent  error. 

Another  problem  is  in  the  duration  the  error  artifacts  (introduced  by  the  channel 
through  dropped  or  corrupted  packets)  are  maintained  in  a  receiver’s  reconstructed  video. 
Error  artifacts  in  the  dynamic  portion  of  a  scene  tend  to  last  only  a  single  frame  because 
block  updates  occur  frequently.  However,  any  such  error  in  a  relatively  static  region 
tends  to  persist  longer  due  to  the  much  lower  frequency  of  block  updating  there. 

A  final  problem  occurs  when  new  participants  are  allowed  to  join  a  VTC  already 
in  progress  (dynamic  multicast).  Since  only  those  macroblocks  depicting  motion  above 
the  perceptual  threshold  are  transmitted,  new  participants  receive  only  a  portion  of  the 
current  scene.  Although  the  portion  received  is  the  most  dynamic  region  of  frame  (e.g. 
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the  speaker),  a  speaker  completely  disassociated  from  the  background  yields  an  awkward 
reconstruction. 

Complementing  the  block  updating  scheme  with  an  aging  algorithm  that 
intermittently  forces  macroblock  updates  alleviates  these  problems.  The  general 
principle  is  that  the  coder  monitors  the  number  of  frames  since  each  macroblock  was  last 
encoded  -  its  “age.”  If  a  macroblock’s  age  exceeds  a  predetermined  interval,  that 
macroblock  is  flagged  for  encoding  and  transmission.  Thus,  the  aging  algorithm 
guarantees  a  maximum  period  between  macroblock  updates  and  bounds  both  the  duration 
of  hysteresis  errors  and  the  persistence  of  visual  artifacts  caused  by  the  channel.  Such 
bounding  also  ensures  that  new  participants  to  a  dynamic  multicast  session  can  construct 
an  entire  frame  in  a  timely  fashion. 

Obviously,  forcing  selection  of  macroblocks  for  transmission  that  would  not  have 
been  selected  otherwise  increases  the  bandwidth  requirement,  but  this  impact  can  be 
mitigated  by  the  manner  in  which  the  aging  algorithm  is  implemented.  Important 
considerations  are  the  following.  Simply  increasing  the  interval  for  macroblock  selection 
by  aging  decreases  the  required  bandwidth  but  increases  the  persistence  of  decode  errors 
at  receivers  and  prolongs  the  time  required  for  new  participants  to  receive  a  full  frame. 
Merely  forcing  a  macroblock  update  after  n  frames  have  passed  without  selection  leads  to 
an  undesirable  correlation  in  updates  following  scene  changes.  Although  motion  within  a 
scene  tends  to  disperse  updates  to  some  extent,  a  sufficiently  static  background  region 
would  still  lead  to  correlation  of  a  significant  fraction  of  the  block  updates.  The  worst 
case  is  realized  by  a  scene  change  where  the  new  scene  is  entirely  static,  such  as  an 
overhead  slide.  In  this  case,  the  bit  rate  would  spike  every  n  frames.  In  order  to  avoid 
undesirable  spikes  in  bit  rate,  it  is  advantageous  to  spread  the  number  of  macroblocks 
selected  by  aging  evenly  over  time.  This,  in  turn,  requires  a  scheme  that  ages  each  block 
independently. 

The  aging  algorithm  developed  for  the  coder  implemented  here  tracks  the  age  of 
each  macroblock  indirectly.  Instead  of  counting  the  frames  since  a  given  macroblock 
was  last  updated,  the  coder  maintains  an  update  vector  identifying  the  number  of  frames 
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remaining  until  each  macroblock  must  be  updated.  Each  value,  m,  in  the  update  vector  is 
chosen  from  a  discrete  uniform  distribution  in  the  range  [l,n]  and  is  interpreted  as  an 
update  scheduled  m  frames  in  the  future.  Using  a  uniform  distribution  to  schedule 
updates  smoothes  macroblock  selection  over  n  frames  and  decorrelates  the  selections  due 
to  aging.  Choosing  the  aging  intervals  randomly  also  prevents  events,  such  as  scene 
changes,  from  correlating  updates  and  causing  periodic  spikes  in  bit  rate.  The  specific 
value  chosen  for  n  controls  the  tradeoff  between  the  additional  bandwidth  required  and 
coder  responsiveness.  For  a  given  value  of  n,  the  average  number  of  macroblocks 
selected  through  aging  per  frame,  M.*. ,  is 

(ni-2) 

("+i) 

where  M  is  the  total  number  of  macroblocks  in  a  frame. 

The  final  block  selection  algorithm  is  performed  by  the  function,  m_blk_id_xr.m. 
m_blkjd_xr.m  is  provided  in  Appendix  C,  and  it  incorporates  motion  detection  and  aging 
as  follows.  As  each  frame  is  captured,  the  update  vector  entry  corresponding  to  each 
macroblock  is  decremented  by  one.  As  each  macroblock  is  processed  for  selection  in  a 
given  frame,  the  coder  examines  the  macroblock’s  entry  in  the  update  vector.  If  its 
corresponding  entry  has  reached  zero,  the  macroblock  is  selected  for  transmission. 

Otherwise,  the  distance  metric  is  applied  to  determine  if  the  macroblock  should  be 
selected  due  to  motion.  If  either  criterion  is  met,  a  new  random  entry  is  generated  for  that 
update  vector  position.  The  order  of  these  two  events  is  important.  Since  the  distance 
metric  need  not  be  calculated  if  the  macroblock  is  selected  by  aging,  there  is  net  decrease 
in  the  number  of  calculations  required  to  select  a  macroblock  for  transmission. 

For  the  assumed  VTC  video  format  of  176x144  QCIF,  M  is  99.  At  the  assumed 
frame  rate  of  10  fps,  setting  n  to  20  guarantees  all  macroblocks  are  encoded  within  two 
seconds  and  yields  9.43  as  the  average  number  of  macroblocks  selected  through  aging  per 
frame.  However,  the  true  bandwidth  impact  is  less  than  this  value  since  some  of  the 
macroblocks  selected  via  aging  would  have  been  selected  by  motion.  For  the  test  motion 
video  sequences  considered  here,  the  average  number  of  macroblocks  selected  per  frame 
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increased  to  31.4  (from  24.8  without  aging).  The  payoff  for  this  modest  increase  in 
bandwidth  is  that  all  the  problems  encountered  by  using  the  distance  metric  alone  for 
macroblock  selection  are  bounded  to  this  duration.  The  duration  of  two  seconds 
represents  a  compromise  judged  acceptable  between  the  additional  bandwidth  required 
and  the  desired  coder  responsiveness. 

3.  Layering  Strategy 

Macroblocks  selected  for  transmission  are  decomposed  in  frequency  using  a 
DWT.  Performing  the  selection  process  prior  to  the  transform  reduces  computational 
cost  because  the  transform  is  only  applied  to  those  macroblocks  requiring  transmission. 
The  DWT  was  chosen  since  frequency  decomposition,  as  discussed  earlier,  offers  the 
most  flexibility  in  the  subsequent  populating  of  layers.  However,  the  task  of  determining 
a  consistent,  extendable  scheme  for  determining  an  appropriate  number  of  layers  and  the 
manner  in  which  the  frequency  content  within  each  macroblock  was  to  be  apportioned 
across  those  layers  remains. 

Parker  [9]  proposes  a  set  of  heuristic  guidelines  to  determine  the  appropriate  layer 
assignments  and  bit  allocation.  First,  as  layers  are  to  be  hierarchical  in  importance,  layer 
assignments  should  map  frequency  content  to  that  hierarchy  in  a  manner  consistent  with 
perceptual  importance.  Second,  the  base  layer  must  provide  acceptable  quality,  and  the 
addition  of  each  enhancement  layer  must  provide  a  perceptual  improvement  in  quality. 
Since  the  broad  goal  in  image  or  video  coding  is  to  remove  information  that  is  not 
perceptually  relevant,  transmitting  a  layer  that  provides  no  perceptual  improvement  in 
quality  wastes  bandwidth  and  is,  therefore,  undesirable.  Third,  the  number  of  bits 
assigned  to  each  layer  should  be  substantive  so  that  dropping  a  layer  potentially  decreases 
congestion  in  the  network.  Finally,  the  bandwidth  consumed  by  the  image  data  should 
also  be  sufficient  to  avoid  an  excessive  relative  consumption  by  network  control  symbols 
and  overhead.  This  is  especially  critical  for  low  bit  rate  video. 

The  DWT  chosen  was  the  fast  Haar  transform  (FHT).  The  FHT  has  several 
desirable  properties  with  regard  to  minimizing  coder  complexity.  First,  the  FHT,  as  a 
real  transform,  avoids  the  necessity  for  complex  arithmetic  and  simplifies  storage. 
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Second,  the  FHT  is  not  computationally  demanding,  requiring  only  addition,  substraction, 
and  left-  and  right-shifts  [13].  Finally,  unlike  more  sophisticated  wavelet  transforms,  the 
FHT  does  not  require  extending  or  padding  the  data  set.  However,  the  simplicity  of  the 
FHT  may  lead  to  blocking  artifacts  at  high  compression  levels  since  the  average  and 
detail  calculations  are  confined  only  to  contiguous  pixels. 

Since  the  defining  equations  of  the  FHT  and  their  manner  of  application  to  affect 
a  first  order  frequency  decomposition  were  discussed  previously  in  Section  n.B.2,  only  a 
summary  of  the  first  order  FHT  decomposition  is  detailed  in  Table  III-l.  The  operation 
is  accomplished  via  the  function  fht.m  as  provided  in  Appendix  C.  Again,  higher  order 
frequency  decompositions  may  be  accomplished  by  recursively  applying  the  FHT  to  each 
of  the  subbands  created  by  the  next  lower  order  decomposition. 
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Table  m-l:  Significance  and  Determination  of  DWT  Subbands. 


Although  a  greater  number  of  layers  would  offer  more  flexibility  in  managing 
quality  and  congestion,  the  coder  restricts  the  number  of  layers  to  three.  The  decision  to 
limit  the  number  of  layers  to  three  was  driven  primarily  by  the  constraint  of  a  target  bit 
rate  in  range  of  64-96  kbps.  At  these  bit  rates  and  with  each  layer  consuming  an  equal 
amount  of  bandwidth  overhead,  three  layers  appeared  to  be  the  limit  in  terms  of 
producing  enhancement  layers  that  provided  a  perceptual  improvement  in  quality  while 
maintaining  a  base  layer  that  rendered  an  acceptable  quality. 

Establishing  a  suitable  layered  structure  for  motion  video  sequences  can  be  posed 
as  the  following  conditional,  two-part  problem.  Given  that  n  layers  are  desired, 
determine  the  degree  to  which  a  selected  macroblock  is  decomposed  in  frequency  and  the 
manner  in  which  the  resulting  subbands  are  assigned  into  layers.  Parker  [9]  proposes 
using  a  variant  of  a  split-and-merge  algorithm,  which  was  originally  proposed  by  Diab,  et 
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al.  [25]  to  identify  regions  of  equivalent  activity  in  the  spatial  domain,  and  applying  it  at 
the  macroblock  level  in  the  frequency  domain  to  identify  region  of  similar  energy  and 
perceptual  content.  The  particulars  of  the  implementation  follow,  but  the  essence  of  the 
approach  is  that  a  selected  macroblock  is  uniformly  decomposed  in  frequency  via  the 
FHT,  subbands  of  approximately  equal  variance  are  merged,  and  the  resulting  regions  are 
apportioned  into  individual  layers.  These  steps  were  performed  on  representative  video 
sequences,  and  the  ensuing  layering  stracture  is  implemented  within  the  coder. 

Utilizing  representative  video  sequences,  each  selected  macroblock  is  uniformly 
decomposed  in  frequency  by  recursively  applying  the  FHT  until  the  desired  number  of 
subbands  is  obtained.  The  first  order  analysis  creates  four  8x8  subbands;  the  second 
order  analysis  produces  sixteen  4x4  subbands.  The  third  order  analysis  results  in  sixty- 
four  2x2  subbands.  Extending  the  recursion  to  its  limit  yields  128  subbands  consisting  of 
a  single  point.  In  practical  terms,  a  second  order  analysis  as  shown  in  Figure  111-5  proved 
sufficient  for  three  layers. 
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Figure  111-5:  Second  Order  FHT  Decomposition  of  a  Macroblock. 


Next,  the  variance  of  the  coefficient  set  composing  each  subband  was  determined 
across  all  frames  of  video.  These  variances  were  then  used  as  a  subband  merging  metric. 
Two  benefits  are  afforded  by  employing  subband  variance  as  the  merging  metric.  First, 
with  motion  video,  the  variance  of  coefficient  sets  appear  to  possess  an  inverse 
relationship  to  spatial  frequency  and,  by  extension,  to  perceptual  importance.  That  is,  the 
more  perceptually  relevant  subbands  at  lower  frequencies  exhibit  higher  variances. 
Consequently,  differences  in  variance  provide  a  convenient  mechanism  for  assigning 


34 


subbands  to  a  layered  hierarchy.  Second,  grouping  subbahds  with  similar  variances 
simplifies  coder  architecture  since  each  group  can  employ  a  common  quantizer.  Several 
quantization  algorithms  use  variance  as  an  indication  of  the  dynamic  range  exhibited  by 
coefficients  and  allocate  bits  by  varying  quantizer  step  size  in  inverse  proportion  to 
variance.  The  particulars  of  the  quantization  scheme  employed  are  discussed  in  more 
detail  later. 

The  subband  variances  calculated  from  a  first  order  analysis  of  the  test  video 
sequences  are  shown  in  Figure  in-6.  Those  from  a  second  order  analysis  are  shown  in 
Figure  HI-?.  As  Figure  III-6  illustrates,  subband  variance  provides  a  good  indication  of 
the  energy  concentration  within  each  subband.  For  instance,  since  motion  video  is 
characteristically  lowpass,  the  variance  is  largest  in  the  IX  subband  and  smallest  in  the 
HH  subband.  By  extension,  subband  variance  therefore  serves  as  a  relative  indicator  of 
the  perceptual  importance  among  subbands  -  an  observation  that  suggests  subband 
variance  should  dictate  layer  assignments.  As  Figure  111-7  illustrates,  a  second  order 
analysis  further  separates  the  frequency  content  of  the  subband  to  which  it  is  applied. 

The  IX  subband  decomposes  into  the  LLLL,  LLLH,  LLHL,  and  IXHH  subbands 
containing  the  lowpass,  horizontal,  vertical,  and  diagonal  edge  details,  respectively, 
which  were  previously  lumped  into  the  LL  subband  alone.  Performing  the  second  order 
analysis  of  the  LH  (horizontal  detail)  subband  apportions  energy  in  a  manner  symmetrical 
to  the  original  first  order  analysis;  that  is,  as  the  LH  subband  is  found  in  the  northeast 
quadrant  of  Figure  in-6,  the  energy  distribution  attained  by  the  second  order  analysis  is 
concentrated  in  the  northeast  and  northwest  sub-quadrants  with  a  slightly  greater  energy 
in  the  northeast  sub-quadrant  as  shown  in  Figure  111-7.  Similar  observations  can  be  made 
for  the  HL  and  HH  subbands.  These  characteristics  further  leverage  the  argument  for 
using  subband  variances  to  make  layer  assignments  in  a  hierarchical  manner. 
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Figure  TTT-6:  Subband  Variances  after  a  First  Order  Analysis  (Motion  Video). 
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Figure  II1-7:  Subband  Variances  after  a  Second  Order  Analysis  (Motion  Video). 


After  variance  data  had  been  gathered  for  each  subband  at  the  desired  order  of 
analysis,  the  next  step  was  to  merge  adjacent  subbands  exhibiting  similar  variances  into 
an  entity  termed  a  “partition.”  The  criterion  outlined  by  [9]  is  to  merge  adjacent 
subbands  k\  and  ki  when 

( <7^ 

log-f  <A<T^  an-3) 

where 


(ni-4) 


where  Nb  is  the  total  number  of  subbands  created  by  successive  application  of  the  FHT, 
and  a  min  and  a  max  are  the  minimum  and  maximum  variances,  respectively,  found  among 
all  the  subbands.  Applying  the  merge  algorithm  to  the  subbands  listed  in  Figure  III-7 
results  in  the  partitions  shown  in  Figure  III-8.  Assuming  that  subbands  are  statistically 
independent,  the  variance  of  each  partition  Pk  is  now  simply  the  sum  of  the  variances  for 
the  subbands  ki  comprising  that  partition: 
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Figure  in-8:  Partitions  Resulting  from  Merge  Algorithm  (Motion  Video). 

Next,  these  partitions  are  assigned  to  layers  Lj  until  the  requisite  number  of  layers 
were  created  using  the  set  of  heuristic  rules  proposed  in  [9]. 

Rule  1;  No  layer  may  have  a  greater  variance  than  a  lower  layer.  That  is,  given  N 
layers, 

<><>•■■><• 

Rule  2:  Layers  must  be  populated  in  order  of  increasing  frequency.  A  layer  may 
not  contain  a  partition  of  lower  frequency  content  than  any  layer  below  it. 

Rule  3:  Partitions  that  meet  the  criterion  given  by  Equation  111-3  are  assigned  to 
the  same  layer  even  if  the  partitions  are  non-contiguous. 

Rule  4:  Partitions  are  applied  to  layers  in  a  symmetric  fashion. 

Rule  5:  If  more  than  two  subbands  comprising  a  coarser  subband  remain  as 
partitions  after  merging  and  applying  the  above  rules,  all  of  the  partitions 
comprising  the  coarser  subband  are  merged  together  into  one  partition. 
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Rule  6:  If  one  or  more  partitions  is  moved  between  layers,  as  required  to  achieve 
a  more  balanced  distribution  of  bit  rates  or  quality,  move  the  partition(s)  with  the 
lowest  variance  if  promoting  to  a  higher  layer  and  the  partition(s)  with  highest 
variance  if  demoting  to  a  lower  layer. 

Application  of  these  rules  to  the  partitions  shown  in  Figure  ni-8  culminated  in  the 
final  layering  scheme  for  motion  video  sequences  shown  in  Figure  III-9.  The  base  layer 
is  essentially  a  lowpass-filtered  version  of  the  original  macroblock,  and  the  two 
enhancement  layers  progressively  added  higher  frequency  details.  Additionally,  since  the 
LL  subband  retains  many  of  the  perceptual  properties  of  the  original  macroblock,  it  was 
further  transformed  using  the  2-D  DCT  via  the  function,  dct_ofJht.m.  This  additional 
transform  allowed  the  LL  subband  to  be  processed  using  JPEG-based  quantization  and 
encoding  techniques  to  maximize  retention  of  the  most  perceptually  relevant  information. 

Summarizing  the  process  as  implemented  in  the  coder,  each  macroblock  selected 
for  transmission  is  decomposed  in  frequency  using  the  FHT.  The  LL  subband  is  further 
transformed  via  the  2-D  DCT  as  previously  discussed  and  assigned  to  layer  I.  The  HH 
subband  is  assigned  to  layer  m  in  its  entirety.  The  HL  and  LH  subbands  are  decomposed 
with  a  second  application  of  the  FHT.  The  resulting  subbands  are  then  partitioned  and 
assigned  to  layers  11  and  IH  as  appropriate. 


Figure  in-9  Final  Layering  Scheme  for  Motion  Video  Sequences. 

The  situation  with  static  slide  sequences  consisting  of  text  and  line  drawings  is 
much  different.  Sequences  such  as  these  demonstrate  a  greater  dependency  on  their 


38 


higher  frequency  components  for  perceptual  recognition.  A  hierarchical  layering  scheme 
based  on  the  lowpass  characteristics  of  motion  video  yields  blurred,  indistinct 
reconstructions  until  the  higher  frequency  components  are  included  as  well.  Such  a 
structure,  of  course,  is  contrary  to  the  principle  of  layered  video  transmission.  Therefore, 
a  separate  layering  scheme  is  needed  if  the  video  stream  is  to  include  both  types  of 
sequences. 

The  foregoing  remarks  not  withstanding,  pursuing  the  split-and-merge  algorithm 
still  provides  valuable  insight.  The  variances  of  the  subbands  produced  by  a  first  order 
FHT  analysis  of  the  text  and  line  drawing  static  sequences  are  shown  in  Figure  HI- 10; 
those  of  a  second  order  FHT  analysis  are  shown  in  Figure  III-l  1 .  Compared  to  Figure 
in-6  and  Figure  TTT-7  -  the  analogous  results  for  motion  video  sequences  -  it  is  clear  that 
energy  is  much  more  evenly  distributed  among  subbands  of  static  slides.  In  practical 
terms,  this  implies  a  much  more  complex  relationship  between  variance  and  perceptual 
importance. 


Figure  III-IO:  Subband  Variances  after  a  First  Order  Analysis  (Static  Slides). 


Figure  lll-ll:  Subband  Variances  after  a  Second  Order  Analysis  (Static  Slides). 

Applying  the  split-and-merge  as  before  results  in  the  partitions  identified  in 
Figure  III-12.  Using  the  rules  of  layer  assignment  as  before,  a  reasonable  layering 
scheme  might  assign  partitions  Pi,  Pi,  and  P4  to  layer  I.  However,  this  implementation 
led  to  a  poor  reconstruction.  Even  adding  P3,  P5,  and  Pf,  so  that  the  vast  majority  of  the 
energy  was  included  in  the  reconstructed  image  failed  to  provide  acceptable  quality. 
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Clearly  contrary  to  the  motion  video  case,  variance  alone  is  a  poor  guide  to  determining 
perceptual  relevance. 
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Figure  HI-ll:  Partitions  Resulting  from  Merge  Algorithm  (Static  Slides). 


Producing  an  “effective”  base  layer  required  allocating  a  portion  of  the  frequency 
content  from  each  of  the  8x8  subbands  to  it.  Therefore,  while  maintaining  symmetry  in 
assignment,  those  4x4  subbands  that  exhibited  comparatively  larger  variances  among  the 
four  subbands  derived  from  the  same  parent  subband  were  allocated  to  layer  I.  The 
remaining  subbands  were  divided  between  the  remaining  layers  in  order  of  increasing 
frequency.  The  final  layering  scheme  for  static  slide  sequences  is  shown  in  Figure  111-13. 


Figure  III-13:  Final  Layering  Scheme  for  Static  Slide  Sequences. 


Although  the  final  layering  scheme  does  not  stem  directly  from  the  partitions  in 
Figure  111-12,  continued  analysis  on  them  is  insightful  nonetheless.  Merging  partitions 
with  similar  variance  reduces  the  partitions  to  those  shown  in  Figure  111-14.  The 
difference  in  variances  of  partitions  Pi  and  P2  is  only  slightly  too  large  to  permit  their 
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merging  by  the  criterion  of  Equation  111-3.  The  difference  is  small  enough,  however,  to 
justify  quantizing  both  bands  with  the  same  step  size.  The  simplicity  gained  in 
quantizing  both  bands  together  balances  any  potentially  suboptimal  bit  allocation. 
Therefore,  the  final  partitions  for  quantization  purposes  are  as  given  in  Figure  111-15. 
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Figure  in>14:  Partitions  after  Merging  Similar  Non-Contiguous  Partitions. 
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Figure  111-15:  Partitions  for  Quantizing  Purposes  (Static  Slides). 

4.  Quantization  and  JEntropy  Encoding 

After  the  transform  stage,  the  coefficients  are  quantized  and  subjected  to  entropy 
encoding.  For  motion  video  sequences,  the  process  is  depicted  in  Figure  111-16.  The  base 
layer  (LL  subband)  is  quantized  using  the  luminance  matrix  (Figure  n-6)  via  the  function, 
quantizerjlm,  run-length  encoded  using  a  zigzag  scan  via  the  function  zzjb.m,  and 
converted  to  a  VLC  using  the  luminance  VLC  table  suggested  in  [4].  The  conversion  to  a 
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VLC  is  accomplished  stepwise  via  three  functions:  make_it_compact.m,  parse_Huff.m, 
and  get_bits_Hujf.m.  This  approach  leverages  the  LL  subband’s  retention  of  the  lowpass 
characteristics  of  the  original  macroblock.  The  implementation  is  as  discussed  previously 
in  Sections  KC  and  n.D  with  the  scaling  factor  of  the  quantization  matrix  designated  as 
qi®.  The  remaining  subbands  are  each  quantized  uniformly  using  a  common  step  size  for 
all  its  coefficients.  Using  variance  as  an  indication  of  the  dynamic  range  of  the 
coefficients  within  a  given  subband  and  comparing  to  Figure  III-9,  it  is  reasonable  to 
quantize  all  the  subbands  composing  each  enhancement  layer  with  the  same  step  size. 
Therefore,  the  quantizer  step  size  for  layer  n  is  set  to  qi,  that  of  layer  HI  is  set  to  qs. 
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Figure  III-16:  Quantization  and  Coding  for  Motion  Video  Macroblocks. 


Unlike  the  JPEG  based  coding  of  the  LL  subband,  zigzag  scanning  of  the 
quantized  FHT  coefficients  provided  no  apparent  coding  gain.  Instead,  trials  indicated  a 
simpler  horizontal  raster  scan  was  adequate  for  all  subbands  except  HL.  The  HL  showed 
a  slight  preference  for  a  vertical  raster  scan.  This  seems  rational  given  the  frequency 
detail  represented  in  this  band.  The  scan  orders  are  summarized  in  Table  111-2  where  the 
scan  order  applies  to  the  indicated  subband  as  well  as  its  child  subbands.  (The  LL  entry 
pertains  only  to  the  coding  of  the  static  content  macroblocks  as  discussed  later,  but  it  is 
included  here  for  completeness.)  Horizontal  raster  scans  are  performed  by  the  function. 


*  As  executed  in  the  code,  the  scaling  factor  qj  is  a  parameter  in  quantizer_ll.m.  A  value  of  16  for  qj  means 
no  scaling  of  the  values  in  Figure  11-6.  A  value  smaller  than  16  results  in  finer  quantization  and  less 
quantization  noise.  qi  must  be  positive. 
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raster.m’,  vertical  raster  scans  are  performed  by  the  function,  verticalm.  Upon 
completion  of  the  run-length  encoding  accomplished  by  the  functions 
makeJt_compact.m  and  parse_3D.m,  each  coefficient  is  represented  by  a  set  of 
codewords.  Each  codeword  is  then  mapped  to  an  entry  in  a  custom  VLC  table  via  the 
function  get_bits2.m.  The  VLC  table  is  provided  in  Appendix  A.  Its  structure  mirrors 
the  three-dimensional  (3-D)  event  structure  employed  by  the  H.263  coding  standard  [26], 
and  the  methodology  of  its  creation  is  presented  in  the  next  section. 


-ParentSubbarid 

Raster  Scan 
'-"-V' Order  ■■  ' 

LL 

Horizontal 

LH 

Horizontal 

HL 

Vertical 

HH 

Horizontal 

Table  111-2:  Scan  Order  for  Run-Length  Encoding  Quantized  Coefficients. 

The  quantization  and  encoding  process  for  selected  macroblocks  from  static 
sequences  is  shown  in  Figure  111-17.  This  process  differs  from  the  motion  video  process 
in  three  ways.  JPEG-based  quantization  is  not  used;  all  sixteen  subbands  are  supplied  to 
one  of  three  independent  uniform  quantizers  with  fixed  step  sizes  of  qi,  qa,  and  qa. 
Contrary  to  the  motion  macroblock  scheme,  the  step  sizes  are  not  associated  with  layers, 
but  with  the  partitions  depicted  in  Figure  111-15.  Finally,  the  VLC  table  that  is  provided 
in  Appendix  B  and  customized  for  static  slides  is  used.  The  same  three  functions  cited 
above  accomplish  these  steps  with  the  appropriate  VLC  table  being  employed  via  the 
argument  kind  in  the  function  get_bits2.m  signaling  the  type  of  frame  under 
consideration  —  motion  video  or  static  slide. 
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Figure  ni-17:  Quantization  and  Coding  for  Static  Macroblocks. 

Neither  Figure  HI- 16  nor  Figure  IE- 1 7  indicates  the  presence  of  the  control  signal 
from  the  Control  Unit  shown  in  Figure  in-3.  The  control  signal  allows  manipulation  of 
ql,  q2,  and  q3  as  required  by  a  bit  rate  control  scheme.  Rate  control  is  covered  later. 

5.  Generating  Customized  VLC  Tables 

The  VLC  coding  scheme  mirrors  the  3-D  event  structure  employed  by  the  H.263 
standard.  Each  non-zero  coefficient  is  replaced  by  an  equivalent  event  described  by  three 
RLE  parameters  [26] :  {LAST,  RUN,  LEVEL} ,  where  LAST  indicates  whether  there  are 
any  more  non-zero  coefficients  in  the  current  subband,  RUN  indicates  the  number  of 
successive  zeros  that  precede  the  non-zero  coefficient,  and  LEVEL  represents  the  non¬ 
zero  magnitude  of  the  quantized  coefficient.  Each  event  maps  to  a  VLC  codeword  to 
which  a  sign  bit  is  appended  to  represent  the  sign  of  the  coefficient. 

Using  various  combinations  of  qi,  qz,  and  qs,  a  series  of  representative  motion 
video  test  sequences  was  processed.  The  processing  followed  the  implementation  as 
discussed  thus  far  (except  for  the  last  step  of  mapping  each  codeword  obtained  form  RLE 
to  a  VLC  table.)  The  relative  frequency  of  occurrence  of  each  RLE  codeword  was  then 
used  to  create  a  Huffman  VLC  utilizing  an  optimal  binary-branching  tree.  Since  every 
possible  (LAST,  RUN,  LEVEL}  event  is  not  guaranteed  to  be  formed  by  the  set  of  test 
video  sequences,  the  custom  VLC  table  further  mirrors  the  H.263  standard  in  that  a 
default  codeword  length  of  22  bits  is  used  for  any  event  not  contained  elsewhere  in  the 
table. 


44 


A  Huf&nan  code  offers  the  following  advantages.  The  average  codeword  length 
is  minimized  because  the  more  frequently  occurring  events  are  assigned  shorter 
codewords  (fewer  bits)  while  the  less  frequently  occurring  codewords  are  assigned  longer 
codewords  (more  bits).  Additionally,  each  codeword  is  uniquely  decipherable  ensuring 
that  no  codeword  can  be  a  prefix  to  a  longer  codeword. 

This  same  procedure  is  used  here  for  both  motion  video  sequences  and  static  slide 
sequences.  Again,  using  different  VLC  tables  for  the  two  basic  types  of  video  content  is 
advantageous  because  of  the  inherent  difference  between  fiieir  frequency  content  and, 
consequently,  the  different  makeup  of  the  RLE  codeword  population. 

6.  Rate-Distortion  Relationship 

Compressed  video  is  inherently  variable  bit  rate  since  compression  gain  varies 
with  scene  activity  and  complexity.  However,  transmission  channels  inevitably  require  a 
constraint  on  bit  rate  due  to  finite  channel  capacity  or  QoS  guarantees.  Most  commonly, 
bit  rate  is  constrained  to  maintain  a  constant  rate  or  to  maintain  a  constant  local-average 
bit  rate  over  time.  Many  factors  affect  bit  rate,  but  the  most  important  is  the  tradeoff 
made  between  quantizer  step  size  and  image  fidelity.  A  larger  step  size  results  in  a  lower 
bit  rate  and  a  larger  amount  of  distortion.  Reducing  the  step  size  increases  the  bit  rate  but 
reduces  the  amount  of  distortion.  Rate  control  therefore  requires  evaluation  of  the  rate- 
distortion  relationship  created  by  a  particular  coder  design. 

The  rate  control  problem  may  be  posed  as  a  resource  allocation  problem  in  terms 
of  the  rate-distortion  relationship,  where  the  goal  is  to  minimize  distortion  D  for  a  bit  rate 
R  subject  to  a  bit  rate  constraint  Rc  [24],  i.e.,  min{Z)},  subject  toR<  Re-  The 
corresponding  optimization  problem  is  solved  using  Lagrangian  methods  and  yields  the 
optimal  solution  for  a  particular  rate  constraint  as  a  point  along  the  rate-distortion  curve. 
Figure  III-18  shows  a  typical  rate-distortion  curve  and  an  optimal  solution  for  a  bit  rate  of 
Ro-  While  the  true  rate-distortion  curve  is  guaranteed  to  be  convex  [17],  the  operational 
curve  is  influenced  by  the  coder  design  including  the  motion-detection  scheme,  the 
quantizer  design,  and  lossless  coding  gains.  Therefore,  rate  control  schemes  tend  to  only 
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approximate  the  true  rate-distortion  relationship  when  determining  a  method  for  varying 
quantizer  step  size  to  achieve  the  desired  bit  rate. 


Figure  111-18:  Rate  Distortion  Curve  with  Possible  Optimal  Solution. 

Additionally,  for  a  layered  coder  where  multiple  quantizer  parameters  are 
employed,  the  corresponding  multi-dimensional  aspect  of  the  rate-distortion  curve 
complicates  the  rate-control  problem.  Assuming  distortion  for  each  layer  i  is  additive, 
the  rate  control  problem  becomes  minimizing 

=  (in-7) 

i=0 

subject  to 

(ni-8) 

1=0 

where  N  is  the  total  number  of  layers.  The  assumption  of  additive  distortion  implies  that 
a  decrease  in  rate  requires  a  suitable  decrease  in  all  quantizer  parameters  to  yield  an 
optimal  solution.  However,  since  the  rate-distortion  curves  in  the  operational  coder  are 
not  necessarily  convex,  the  approach  above  does  not  necessarily  give  optimal  results.  An 
alternate,  albeit  heuristic,  approach  proposed  by  Parker  [9]  is  to  simplify  the  control 
problem  by  creating  a  simplilhed,  operational  rate-distortion  curve. 

Considering  the  test  motion  video  sequences,  an  operational  distortion  curve  is 
created  by  first  plotting  total  bit  rate  and  distortion  (measured  by  pSNR)  separately 
through  a  three-dimensional  space  spanned  by  the  set  of  candidate  quantizers.  This 
process  captures  the  operational  effect  of  the  coder  design,  such  as  the  values  of  the 
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quantizer  parameters  {qi,q2,q3}  and  the  VLC  coding  gain  as  well  as  any  interdependence 
between  layers,  on  the  rate-distortion  relationship.  The  result  is  best  described  as  a  four¬ 
dimensional  (4-D)  surface  wherein  both  rate  and  distortion  are  functions  of  a  triplet  of 
quantizer  parameters  {qi,q2,q3}.  Recall  that  the  first  parameter  represents  the  JPEG 
scaling  factor  while  the  remaining  parameters  represent  the  actual  quantizer  step  sizes. 

Next,  the  points  representing  the  pSNR  surface  are  sorted  in  a  descending  order 
and  associated  with  their  corresponding  average  bit  rates  and  quantizer  triplets.  Any 
triplet  set  yielding  a  higher  average  bit  rate  for  the  same  or  lower  pSNR  is  discarded.  The 
result  is  an  implicit  vector  quantization  of  the  operational  3-D  rate-distortion  space.  The 
dimensionality  of  the  operational  rate-distortion  curve  is  therefore  reduced  to  1-D  as 
shown  in  Figure  111-19.  Each  point  on  the  curve  represents  results  from  a  single  optimal 
triplet.  Considering  only  those  quantizer  triplets  associated  with  average  bit  rates  about 
the  target  bit  rates  of  64-96  kbps  and  considering  a  5%  change  in  the  average  bit  rate  of 
the  coarsest  quantizer  triplet  as  a  reasonable  control  step,  the  operational  rate-distortion 
curve  of  Figure  III-19  reduces  to  that  shown  in  Figure  111-20.  The  corresponding 
quantizer  triplets  are  plotted  in  Figure  III-21 .  These  results  indicate  that  an  optimal  rate 
control  scheme  does  not  necessarily  increase/decrease  each  quantizer  parameter  in 
lockstep  as  would  be  expected  if  distortion  in  each  layer  were  independent. 


Figure  in-19:  Operational  Rate  Distortion  Curve  (Motion  Video). 
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Figure  111-21:  Quantizer  Table  Triplet  Values  (Motion  Video). 

The  same  approach  was  followed  for  static  slide  sequences.  However,  a  slightly 
different  behavior  was  observed.  Instead  of  the  expected  continued  decrease  in  MSB  as 
bit  rate  increased  with  finer  quantization,  the  behavior  was  as  depicted  in  Figure  in-22. 
Past  a  limiting  value  of  MSB,  finer  quantization  and  the  related  increase  in  bit  rate 
yielded  no  increase  in  image  fidelity.  After  reconstructing  the  custom  VLC  table  with  the 
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quantizer  triplet  at  that  limiting  point  ({qi:q2:q3}  equal  to  {4:16:16})  the  average  bit  rate 
became  approximately  44  kbps  as  shown.  Since  this  bit  rate  is  below  64  kbps,  no 
additional  bit  rate  control  scheme  was  deemed  necessary;  all  static  slide  sequences  are 
quantized  with  the  same  triplet. 
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Figure  111-22:  Operational  Rate-Distortion  Curve  (Static  Slides). 

7.  Bit  Rate  Control 

This  approach  to  the  rate-distortion  problem  provides  a  potential  method  for  a 
simplified  layered  rate  control  scheme  since  the  set  of  possible  quantizer  parameters  is 
reduced  to  a  far  smaller  set  of  parameters.  Considering  each  triplet  as  an  optimal 
quantizer  state,  any  control  scheme  would  manipulate  the  quantizers  for  each  layer  of  a 
motion  video  sequence  by  selecting  only  entries  from  this  set  via  a  simple  table  lookup. 
Parker  [9]  proposes  two  such  schemes.  One  functions  at  the  frame  level;  the  other 
operates  at  the  macroblock  level.  The  former  was  examined  and  implemented  for  this 
thesis  in  the  coder. 

Using  the  operational  rate-distortion  curve,  a  linear  control  curve  relating  bits  per 
frame  B  to  quantizer  setting  Q  is  created  as  shown  in  Figure  111-23.  The  slope  ABfAQ 
represents  the  average  increment  or  decrement  in  bits  per  frame  with  a  step  change  in  the 
quantizer  table.  Dividing  this  quantity  by  the  average  number  of  macroblocks  selected 
per  frame  in  the  test  sequences,  M  ,  yields  the  desired  control  parameter  P: 
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The  calculated  value  of  P  resulting  from  the  test  motion  video  sequences  was  -1 1 .346. 
This  control  parameter  was  then  used  to  adjust  the  coder  quantizer  setting  in  accordance 
witii  the  following  scheme. 
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Figure  III-23:  Operational  Rate  Control  Curve  (Motion  Video). 

At  call  setup,  the  average  bit  allocation  per  frame  B  is  set  to 

(III-IO) 

where  is  the  channel  bit  rate,  and /is  the  frame  rate.  For  each  new  frame  i,  the 

actual  bit  allocation  from  the  last  frame  (i-l)  is  used  to  estimate  the  bit  allocation  error  or 
deviation  expected  to  result  from  the  current  frame  i  if  the  quantizer  setting  used  in  the 
previous  frame  is  not  changed.  Accounting  for  the  change  in  the  number  of  macroblocks 
selected  between  the  last  and  current  frames,  the  deviation  expected  is: 


AB|„.e.=5- 
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where  Mi  is  the  number  of  macroblocks  selected  for  transmission  in  the  current  frame, 
Mi-i  is  the  number  of  macroblocks  selected  for  transmission  in  the  previous  frame,  and 
B,._j  is  the  number  of  bits  used  in  the  transmission  of  the  previous  frame.  The  required 

change  in  the  quantizer  setting  is  calculated  using  the  deviation  AB  inter,  the  number  of 
macroblocks  selected  for  transmission  in  the  current  frame  Mi,  and  the  control  parameter 
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Here,  [  J  is  the  fixed  integer  operator  that  discards  the  decimal  portion  of  the  result.  The 
result  indicates  the  quantizer  setting  from  the  last  frame  should  be  incremented  or 
decremented  by  Aj2i  •  If  the  quantizer  has  reached  the  upper  or  lower  limit  of  the  table, 
the  value  is  not  changed.  This  quantizer  triplet  selection  scheme  is  accomplished  via  the 
function,  get_qd_entryf.m. 

This  control  scheme  is  only  applicable  for  motion  video;  a  single  quantizer 
parameter  triplet  is  used  for  static  slides.  The  only  exception  is  due  to  a  scene  change. 
When  a  scene  change  is  detected  by  the  coder  (as  defined  in  the  next  section),  so  much  of 
the  frame  is  selected  for  transmission  that  a  spike  in  bit  rate  would  occur  if  quantized 
with  any  of  the  triplets  available  in  the  control  table.  To  avoid  this  undesirable  spike,  the 
first  frame  of  a  new  scene  is  heavily  compressed.  Following  this  initial  frame,  the 
appropriate  quantization  technique  ensues. 


8.  Scene  Change  Detection  and  Scene  Type  Determination 

Since  the  bit  rate  must  be  suppressed  during  a  scene  change  and  the  coder  must 
determine  which  of  two  possible  layering  schemes  to  employ  following  a  scene  change, 
these  criteria  must  be  defined.  The  coder  concludes  that  a  scene  change  has  occurred  if 
the  number  of  macroblocks  selected  exceeds  a  threshold.  This  threshold  was  determined 
from  the  block  selection  statistics  of  the  test  video  sequence  containing  the  most  highly 
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active  content  (though  still  considered  low-motion  video)  of  all  the  test  video  sequences 
examined.  For  this  sequence,  the  average  number  of  macroblocks  selected  per  frame  was 
34.61  and  their  standard  deviation  was  10.19.  The  threshold  was  set  at  three  standard 
deviations  above  the  mean  (65).  The  comparison  operation  is  performed  in  the  main 
code  block,  thesis.m.  A  frame  sequence  is  determined  to  be  static  when  macroblocks 
chosen  for  transmission  result  solely  from  aging.  In  this  case,  the  fixed  quantizer  triplet 
and  the  static  VLC  table  from  Appendix  B  are  used.  Otherwise,  the  sequence  is  deemed 
motion  video,  and  the  bit  rate  control  scheme  discussed  above  is  employed  with  its 
associated  custom  VLC  table  from  Appendix  B.  This  determination  is  performed  within 
the  function,  m_blk_id_xr.m. 

This  chapter  began  with  a  presentation  of  the  three  basic  techniques  available  for 
layered  video  coding.  The  approach  implemented  here,  as  proposed  by  Parker  [9],  is 
frequency-based  and  utilizes  the  FHT  and  the  2-D  DCT  for  motion  video  sequences  and 
the  FHT  alone  for  static  slide  sequences.  Different  frequency  transforms  are  used  due  to 
the  inherent  differences  in  the  perceptual  frequency  content  within  the  two  types  of 
sequences.  The  method  of  implementing  frame  refreshment  was  detailed  as  a  block 
selection  scheme  applied  at  the  macroblock  level  that  captures  perceptual  changes  due  to 
motion  within  a  scene  and  limits  the  duration  of  decoder  errors  at  receivers  by  forcing 
macroblock  updates  via  an  aging  algorithm.  Quantization  and  encoding  techniques  were 
presented  with  the  implementation  including  the  use  of  the  JPEG  standard  and  uniform 
quantization  coupled  with  one  of  two  custom  VLC  tables.  The  issue  of  rate  control  was 
addressed,  and  the  implementation  of  a  scheme  that  reduces  a  4-D  rate  control  surface  to 
a  simple  quantizer  table  lookup  to  control  bit  rate  at  the  frame  level  was  discussed. 
Finally,  the  manner  by  which  the  coder  detects  a  scene  change  and  determines  the  type  of 
sequence  under  consideration  was  delineated. 
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rV.  RESULTS 


This  chapter  presents  some  results  from  a  short  video  segment  consisting  of  100 
frames  of  a  single  speaker  followed  by  50  frames  of  a  presentation  slide  filled  with  line 
diagrams  and  text.  The  Matlab  code  is  contained  in  Appendix  C.  A  sample  frame  from 
each  sequence  is  shown  in  Figure  IV- 1  and  Figure  IV-2.  Each  shows  the  original  frame 
and  the  reconstructed  frame  with  only  the  base  layer  received,  with  the  base  layer  and  the 
first  enhancement  layer  received,  and  with  all  layers  received. 


Original  Frame  Layers  1, 2,  and  3 


Figure  IV-1:  Original  and  Reconstructed  Frames  from  a  Motion  Video  Sequence. 
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Original  Frame 


Layers  1 ,  2,  and  3 


Figure  IV-2:  Original  and  reconstructed  Frames  from  a  Static  Video  Sequence. 

Figure  IV-3  shows  the  bit  rate  traces  for  the  150-frame  layered  video  segment. 
Part  (a)  displays  the  traces  resulting  from  the  constant  motion  video  quantizer  triplet 
({6,12,20})  expected  to  yield  80  kbps  based  on  the  average  bit  rate  resulting  form  all  test 
motion  video  sequences  and  the  fixed  static  sequence  quantizer  triplet.  Part  (b)  displays 
the  traces  resulting  from  the  bit  rate  control  scheme  for  the  motion  video  frames  with  the 
coder  attempting  to  achieve  an  average  80  kbps  and  the  fixed  static  sequence  quantizer 
triplet.  Bit  rate  spike  suppression  is  employed  for  the  initial  frame  of  each  scene  as 
discussed  previously.  The  distribution  of  bit  rate  offered  by  a  layered  video  coder  is 
evident  with  the  bit  rate  ratio  among  layers  being  approximately  5:3:2  for  both  sequences. 
As  congestion  occurs  in  the  network,  the  higher  layers  can  be  dropped  to  combat  the 
congestion  while  maintaining  much  of  the  quality  as  illustrated  in  Figure  IV- 1  and  Figure 
IV-2.  Neglecting  the  initial  frame,  the  average  bits  per  frame  for  the  motion  video 
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sequence  without  the  control  scheme  is  7454  bpf  with  a  standard  deviation  of  1362  bpf. 
With  the  control  scheme,  the  average  and  standard  deviation  are  7988  bpf  and  942  bpf, 
respectively.  As  expected,  the  bit  rate  from  the  static  sequence  is  much  lower  since  the 
bit  rate  results  solely  form  macroblock  aging.  This  illustrates  that  rate  control  is  not  of 
significant  benefit  for  static  sequences. 


Figure  rV-3:  Bit  Rates  for  (a)  Fixed  Quantization  and  (b)  Bit  Rate  Control. 

Figure  IV-4  quantifies  the  progressive  improvement  in  quality  of  the 
reconstructed  video  based  on  pSNR  for  the  same  150-frame  layered  video  segment 
illustrated  in  Figure  IV- 1  and  Figure  IV-2.  At  the  beginning  of  each  sequence,  quality 
ramps  up  over  the  aging  interval  following  a  scene  change.  After  this  period,  quality  is 
observed  to  remain  relatively  flat  for  each  sequence  regardless  of  the  number  of  layers. 
For  the  motion  video  sequence,  the  base  layer  provides  a  smoothed  but  acceptable 
display.  Text  is  not  readable  but  the  speaker’s  movements  are  easy  to  follow.  Adding 
the  first  enhancement  layer  improves  sharpness  and  adds  a  4  dB  improvement  in  pSNR 
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although  small  text  is  still  difficult  to  discern.  The  second  enhancement  layer  only  adds 
1-2  dB  improvement,  but  small  text  is  clearly  readable  and  other  features  with  fine  edges 
are  sharper.  With  static  video,  the  role  of  the  enhancement  layers  is  even  more  dramatic. 
Even  though  most  of  the  macroblock’s  energy  is  included  in  the  base  layer  and 
contributions  from  each  frequency  band  are  included,  the  base  layer  still  shows  much 
softness  although  the  shapes  are  readily  identifiable.  The  first  enhancement  layer  adds  a 
7  dB  improvement  and  dramatically  improves  sharpness.  The  final  layer,  even  though 
the  bit  rate  contribution  is  the  smallest  of  the  three  layers,  almost  doubles  the  pSNR,  and 
the  reconstructed  firame  is  virtually  identical  to  the  original  frame.  Neglecting  the  initial 
frame,  the  average  pSNR  utilizing  all  layers  for  the  motion  video  sequence  without  the 
control  scheme  is  29.5  dB  with  a  standard  deviation  of  1.7  dB.  With  the  control  scheme, 
the  average  and  standard  deviation  are  29.8  dB  and  1.9  dB,  respectively.  Note  that  these 
values  include  the  ramp  up  in  quality  following  the  initial  scene  change.  These  average 
pSNR  values  are  not  directly  comparable,  however,  because  of  the  difference  in  average 
bit  rates;  the  higher  quality,  rate  control  approach  uses  approximately  500  more  bits  per 
firame.  But  since  this  quality  is  achieved  within  the  desired  bit  budget  of  80  kbps,  the  rate 
controller  allows  more  effective  utilization  of  the  available  bandwidth  to  better  image 
fidelity.  The  statistics  obtained  from  the  motion  video  sequence  traces  in  Figure  IV-3 
and  Figure  IV-4  are  summarized  in  Table  IV- 1 . 
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(b) 

Figure  IV-4:  pSNR  for  (a)  Fixed  Quantization  and  (b)  Bit  Rate  Control. 


/^ith  RateCbntn 

5l Without  Rate  Control 

Mean  Bit  Rate  (bpf) 

7998 

7454 

Bit  Rate  STD  (bpf) 

942 

1362 

Mean  pSNR  (dB) 

29.83 

29.51 

pSNR  STD  (dB) 

1.92 

1.74 

Table  IV-1:  Rate  Controlled  and  Uncontrolled  Motion  Video  Sequence  Statistics. 

Additionally,  the  issue  of  the  layered  video  codec’s  resilience  to  bit  errors 
introduced  during  transmission  was  examined  using  the  same  motion  video  sequence  as 
above.  Using  loss  rates  of  10%,  25%,  and  50%,  four  different  case  were  tested.  Case  one 
distributed  the  bit  errors  across  the  layers  in  proportion  to  their  contribution  to  the  total 
bit  rate  and  utilized  zero-order  error  concealment;  the  reconstructed  frame  retained  the 
content  of  the  previous  frame  for  that  portion  lost  during  transmission.  If  the  loss 
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occurred  in  the  base  layer,  the  enhancement  layers  were  neglected,  the  loss  occurred  in 
an  enhancement  layer,  the  reconstruction  was  performed  utilizing  the  base  layer  and  the 
other  enhancement  layer.  Case  two  treated  all  layers  as  a  single  video  stream  and  utilized 
zero-order  error  concealment;  that  is,  a  transmission  loss  was  a  loss  for  all  layers.  Cases 
three  and  four  are  identical  to  cases  one  and  two,  respectively,  except  that  no  error 
concealment  was  used.  Instead  an  information  loss  caused  the  decoder  to  assign  the 
value  of  zero  to  all  coefficients  in  the  affected  layers.  As  Figure  IV-5  illustrates, 
spreading  bit  errors  across  multiple  layers  has  less  negative  impact  on  the  reconstructed 
image  at  high  loss  rates. 


Lossrate 

Figure  rV-5:  Comparison  of  Error  Resilience. 

Considering  static  slide  sequences,  the  relationship  is  more  complicated.  In 
general,  since  the  frame  content  is  entirely  static,  the  best  reconstruction  in  an  error  prone 
environment  is  to  forgo  a  macroblock  update  if  the  update  would  be  made  with  fewer 
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layers  than  the  present  reconstruction  of  that  macroblock.  Further  study  into  the 
implementation  of  such  a  scheme  is  warranted. 

This  chapter  presented  representative  frames  from  a  video  segment  consisting  of 
motion  video  and  static  slides.  For  each  type  of  content,  the  original  frame  and  the 
reconstructions  with  one,  two,  and  three  layers  were  given.  Also  presented  were  plots  of 
the  bits  per  frame  and  pSNR  as  a  function  of  the  frame  sequence  for  both  fixed 
quantization  and  variable  quantization  using  the  bit  rate  control  scheme.  These  plots 
served  to  quantify  the  quality  depicted  in  the  reconstructions  and  to  illustrate  the  bit 
allocation  among  the  layers.  Finally,  the  effect  of  spreading  bit  errors  across  layers 
compared  to  confining  them  to  a  single  video  stream  was  presented. 


V.  CONCLUDING  REMARKS 


A.  CONCLUSIONS 

This  tiiesis  has  presented  the  development  and  implementation  of  a  new  layered 
video  codec  proposed  by  Parker  [9]  that  emphasizes  robust  transmission  of  a  video 
teleconference  (VTC)  at  low  bit  rates.  The  dual  nature  of  the  targeted  scene  content  - 
low-motion  video  and  static  slide  sequences  consisting  of  text  and  line  drawings  -  and 
the  assumed  application  in  a  multicast,  heterogeneous,  wireless  network  environment 
required  significant  flexibility  in  the  implementation.  The  essential  features  of  the  coder 
are  summarized  as  follows. 

Frame  refreshment  is  accomplished  via  block  updating  and  an  aging  algorithm, 
both  applied  at  the  macroblock  level.  This  approach  promotes  greater  robusmess  because 
spatial  error  propagation  is  eliminated  and  temporal  error  propagation  is  greatly  limited. 
The  combined  technique  captures  perceptual  changes  due  to  motion  within  the  scene, 
limits  the  duration  of  error  artifacts  in  the  reconstruction  at  receivers,  and  ensures  that 
new  participants  in  a  VTC  session  that  is  already  in  progress  receive  a  complete  frame  in 
a  timely  maimer. 

The  macroblocks  selected  for  transmission  are  decomposed  in  frequency  using  the 
fast  Haar  transform  (FHT).  For  motion  video  sequences,  the  lowpass  subband  is  further 
processed  with  the  two-dimensional  discreet  cosine  tranform.  The  horizontal  and  vertical 
edge  detail  subbands  are  further  decomposed  with  a  repeated  application  of  the  FHT. 
Static  slide  sequences  are  decomposed  solely  by  a  second  order  FHT  analysis. 

The  lowpass  subband  of  motion  video  is  quantized  and  encoded  using  the  JPEG 
standard  in  order  to  exploit  the  human  visual  system  perceptive  characteristics.  The 
remaining  subbands  of  motion  video  sequences  are  subjected  to  uniform  quantization  and 
encoding  with  a  custom  variable  length  coding  (VLC)  table.  All  subbands  of  static  slides 
are  subjected  to  xiniform  quantization  and  encoding  with  a  separate  custom  VLC  table. 

All  quantization  is  performed  using  a  triplet  of  quantizers,  and  each  subband  is  quantized 
with  one  of  the  triplet  parameters  based  on  the  variance  of  subband  coefficients. 
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Subbands  are  assigned  to  layers  by  grouping  bands  of  coefficients  with  similar 
variances  into  a  common  layer.  Three  layers  are  used  in  the  coder.  The  base  layer  is 
independently  decodable  and  yields  an  acceptable,  minimum-quality  reconstruction;  each 
enhancement  layer  progressively  improves  the  quality  of  the  reconstruction. 

Bit  rate  is  controlled  at  the  frame  level  by  selecting  the  quantizer  triplet  to  be  used 
in  the  current  frame  based  on  the  number  of  bits  used  in  the  previous  frame,  the  desired 
average  bit  rate,  and  the  number  of  macroblocks  selected  for  transmission.  The 
implementation  involves  a  simple  table  lookup,  which  resulted  from  the  optimal  one¬ 
dimensional  reduction  of  a  four-dimensional  control  surface. 

B.  FUTURE  WORK 

The  implementation  and  the  results  presented  here  suggest  that  the  layered  video 
codec  has  potential  practical  utility  to  video  teleconferencing  in  multicast,  heterogeneous, 
wireless  networks.  Now,  several  aspects  of  the  coder  can  be  pursued  further.  For 
example,  although  three  layers  were  used  in  the  present  implementation,  the  techniques 
employed  can  be  used  to  scale  the  coder  to  include  an  arbitrary  number  of  layers.  Work 
on  techniques  to  dynamically  change  the  layering  scheme  within  a  sequence  is  desirable. 
The  ability  to  handle  color  and  audio  needs  to  be  incorporated  into  the  code.  Further 
refinement  of  the  block  search  pattern  utilized  for  macroblock  selection  and  the 
possibility  of  rate  control  at  the  macroblock  level  can  be  evaluated.  Also,  with  regard  to 
static  slide  sequences,  investigation  of  the  ability  to  detect  and  reconstruct  slight 
movement  within  such  a  frame,  such  as  the  movement  of  a  cursor,  is  warranted.  Finally, 
implementation  in  a  high  level  language  or  hardware  is  another  potential  future  task. 
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APPENDK  A.  MOTION  VIDEO  VARIABLE  LENGTH  CODING  TABLE  (VLC) 


The  following  is  the  custom  VLC  table  used  with  motion  video  sequences.  The 

last  character  in  the  codeword  s  indicates  the  appended  sign  bit. 

BITS  CODEWORD 


INDEX 

LAST 

RUN 

LEVEL 

1 

0 

0 

1 

2 

0 

0 

2 

3 

0 

0 

3 

4 

0 

0 

4 

5 

0 

0 

5 

6 

0 

0 

6 

7 

0 

1 

1 

8 

0 

1 

2 

9 

0 

1 

3 

10 

0 

1 

4 

11 

0 

1 

5 

12 

0 

1 

6 

13 

0 

2 

1 

14 

0 

2 

2 

15 

0 

2 

3 

16 

0 

2 

4 

17 

0 

3 

1 

18 

0 

3 

2 

19 

0 

3 

3 

20 

0 

3 

4 

21 

0 

4 

1 

22 

0 

4 

2 

23 

0 

4 

3 

24 

0 

4 

4 

25 

0 

5 

1 

26 

0 

5 

2 

27 

0 

5 

3 

28 

0 

5 

4 

29 

0 

6 

1 

30 

0 

6 

2 

31 

0 

6 

3 

32 

0 

6 

4 

33 

0 

7 

1 

34 

0 

7 

2 

35 

0 

7 

3 

36 

0 

7 

4 

37 

0 

8 

1 

38 

0 

8 

2 

39 

0 

8 

3 

40 

0 

8 

4 

41 

0 

9 

1 

42 

0 

9 

2 

43 

0 

9 

3 

44 

0 

10 

1 

45 

0 

10 

2 

46 

0 

10 

3 

47 

0 

10 

5 

3  11s 

5  0101s 

6  10010s 

9  10100011s 

12  10111001101s 

14  1011111001110s 

5  0100s 

7  100111s 

10  001000101s 

12  10111110000s 

17  1011010010000111s 

14  101101001 0100s 

5  0000s 

8  1011001s 

12  01110001010s 

13  001000110110s 

6  01111s 

9  011001 OOS 

11  1011100101s 

13  101101011001s 

6  0001  1  s 

9  01110010s 

10  101101010s 

15  00101101001111s 

6  10010s 

10  000101101s 

12  101111 10001s 

16  101101001000111s 

7  011101s 

10  000101 1 10s 

14  0111000100000s 

16  1  01  1  01  001  0001  1  Os 

7  011011s 

10  001000100s 

14  1011111001111s 

16  101101001000001s 

7  011000s 

9  00100100s 

10  001011011s 

13  011  1  0001  0001  s 

8  0010111s 

11  001  001  1  01  0s 

14  1001101011 OOOS 

8  1011011s 

11  1  001  1  01  001  s 

14  0010011011010s 

16  101101001000000s 
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48 

0 

11 

1 

49 

0 

11 

2 

50 

0 

11 

3 

51 

0 

12 

1 

52 

0 

12 

2 

53 

0 

12 

3 

54 

0 

12 

4 

55 

0 

13 

1 

56 

0 

13 

2 

57 

0 

13 

3 

58 

0 

14 

1 

59 

0 

14 

2 

60 

0 

14 

3 

61 

0 

15 

1 

62 

0 

15 

2 

63 

0 

16 

1 

64 

0 

16 

2 

65 

0 

17 

1 

66 

0 

17 

2 

67 

0 

18 

1 

68 

0 

19 

1 

69 

0 

20 

1 

70 

0 

20 

2 

71 

0 

21 

1 

72 

0 

22 

1 

73 

0 

22 

2 

74 

0 

23 

1 

75 

0 

23 

2 

76 

0 

24 

1 

77 

0 

24 

2 

78 

0 

24 

3 

79 

0 

25 

1 

80 

0 

25 

2 

81 

0 

26 

1 

82 

0 

26 

2 

83 

0 

27 

1 

84 

0 

27 

2 

85 

0 

28 

1 

86 

0 

28 

2 

87 

0 

29 

1 

88 

0 

29 

2 

89 

0 

30 

1 

90 

0 

30 

2 

91 

0 

31 

1 

92 

0 

31 

2 

93 

0 

32 

1 

94 

0 

32 

2 

95 

0 

33 

1 

96 

0 

33 

2 

97 

0 

34 

1 

98 

0 

34 

2 

99 

0 

35 

1 

100 

0 

35 

2 

101 

0 

36 

1 

102 

0 

36 

2 

103 

0 

37 

1 

9  00101100s 

12  00100011100s 

14  1011010010101s 

8  0010000s 

10  011100011s 

13  1011111001 00s 

15  00010110011101s 

9  10111000s 

12  10111111111s 

15  00101101001 100s 

9  00100101s 

13  001000110111s 

17  1011010010011010s 

10  011001010s 

14  1011010001011s 

10  101000100s 

15  10011011110010s 

11  0001011110s 

17  oil  001  01  1  001  1  1  1  1  s 
11  1  01  0001  01  1  s 

11  1  01  0001  01  0s 

11  1011010011s 

17  0110010110011110s 

12  01100101101s 

12  01110001011s 

16  101101001011101s 

12  00010111111s 

15  00101101001101s 

11  0010001100s 

16  101101001011100s 

17  1011010010000100s 

12  00100110011s 

16  101101001011111s 

12  10111111001s 

17  1011010000101000s 

12  00010110000s 

17  1011010000101011s 

12  00100011110s 

17  1011010000101010s 

12  10111111000s 

17  1011010010010101s 

12  10110101110s 

15.  1  01  1  01  00001  000s 

12  10011011100s 

17  1011010010010100s 

12  10011010101s 

14  1011 010001 010s 

13  1011111111 00s 

17  1011010010010001s 

13  001000111110s 

17  1011010010010000s 

13  1011111 10100s 

17  1011010010010011s 

12  10011010100s 

17  1011010010010010s 

13  000101111100s 
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104 

0 

38 

1 

105 

0 

39 

1 

106 

0 

40 

1 

107 

0 

40 

2 

108 

0 

41 

1 

109 

0 

41 

2 

110 

0 

42 

1 

111 

0 

43 

1 

112 

0 

43 

2 

113 

0 

44 

1 

114 

0 

45 

1 

115 

0 

46 

1 

116 

0 

46 

2 

117 

0 

47 

1 

118 

0 

48 

1 

119 

0 

48 

2 

120 

0 

49 

1 

121 

0 

50 

1 

122 

0 

51 

1 

123 

0 

52 

1 

124 

0 

53 

1 

125 

0 

53 

2 

126 

0 

54 

1 

127 

0 

54 

2 

128 

0 

55 

1 

129 

0 

56 

1 

130 

0 

56 

2 

131 

0 

57 

1 

132 

0 

57 

2 

133 

0 

58 

1 

134 

0 

59 

1 

135 

0 

60 

1 

136 

0 

61 

1 

137 

0 

62 

1 

138 

1 

0 

1 

139 

1 

0 

2 

140 

1 

0 

3 

141 

1 

0 

4 

142 

1 

0 

5 

143 

1 

1 

1 

144 

1 

1 

2 

145 

1 

1 

3 

146 

1 

1 

4 

147 

1 

1 

5 

148 

1 

2 

1 

149 

1 

2 

2 

150 

1 

2 

3 

151 

1 

2 

4 

152 

1 

3 

1 

153 

1 

3 

2 

154 

1 

3 

3 

155 

1 

3 

4 

156 

1 

4 

1 

157 

1 

4 

2 

158 

1 

4 

3 

159 

1 

4 

4 

12  10111001110s 

12  101101 0001 1 s 

13  101111111001s 

16  001011010010100s 

13  101110011110s 

17  101 101001 0011101s 

14  0111000100100s 

13  011001011000s 

17  1011010010011100s 

13  1011 10010001s 

12  10110101101s 

14  0111000100111s 

16  101101001011110s 

13  100110111011s 

12  10011010000s 

15  00010110010000s 

13  011001011111s 

13  101101000011s 

14  1011 1001 10010s 

14  0111000100110s 

14  1  001  1  01  01  1  1  00s 

16  101101001011001s 

14  1011111001100s 

16  001011010010111s 

14  0001011001101s 

13  101111111 OOOs 

15  00010110010011s 

14  0001011001100s 

16  001011010010001s 

15  10011011110011s 

16  011001011001110s 

16  000101100111000s 

17  0001011001110010s 

15  00101101000000s 

5  001  1s 

7  011010s 

8  1011000s 

11  0010  011000s 

15  00101101001001s 

6  1  0001  s 

10  101111101s 

13  101111110101s 

15  10110100001011s 

17  1011010010000110s 

7  001010s 

11  0010110101s 

13  100110111101s 

16  101101001000101s 

6  10101s 

10  100110110s 

13  001001100100s 

16  001011010010000s 

7  101001s 

12  00100110111s 

16  101101 001 01 1000s 

16  101101001000100s 
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160 

1 

5 

1 

161 

1 

5 

2 

162 

1 

5 

3 

163 

1 

6 

1 

164 

1 

6 

2 

165 

1 

7 

1 

166 

1 

,  7 

2 

167 

1 

7 

3 

168 

1 

8 

1 

169 

1 

8 

2 

170 

1 

8 

3 

171 

1 

9 

1 

172 

1 

9 

2 

173 

1 

9 

3 

174 

1 

10 

1 

175 

1 

10 

2 

176 

1 

11 

1 

177 

1 

11 

2 

178 

1 

11 

3 

179 

1 

12 

1 

180 

1 

12 

2 

181 

1 

12 

3 

182 

1 

13 

1 

183 

1 

13 

2 

184 

1 

14 

1 

185 

1 

14 

2 

186 

1 

14 

3 

187 

1 

15 

1 

188 

1 

15 

2 

189 

1 

15 

3 

190 

1 

15 

4 

191 

1 

16 

1 

192 

1 

16 

2 

193 

1 

17 

1 

194 

1 

18 

1 

195 

1 

19 

1 

196 

1 

20 

1 

197 

1 

20 

2 

198 

1 

21 

1 

199 

1 

22 

1 

200 

1 

23 

1 

201 

1 

23 

2 

202 

1 

24 

1 

203 

1 

24 

2 

204 

1 

25 

1 

205 

1 

26 

1 

206 

1 

27 

1 

207 

1 

28 

1 

208 

1 

29 

1 

209 

1 

30 

1 

210 

1 

31 

1 

211 

1 

31 

2 

212 

1 

32 

1 

213 

1 

32 

2 

214 

1 

33 

1 

215 

1 

34 

1 

8  1011110s 

15  00010110010001s 

15  00010110010010s 

8  0001001s 

13  100110101101s 

8  0110011s 

12  10011011111s 

14  1001101011001s 

8  0001010s 

13  101110010010s 

16  101101001011011s 

9  01110011s 

15  00101101000011s 

17  1011010010011000s 

8  1001 100s 

15  00101101000010s 

8  1010000s 

14  1001101011110s 

17  1011010010011011s 

8  1011101s 

13  001011010001s 

16  101101001011010s 

9  00100111s 

14  0111000100001s 

9  0111 0000s 

18  0001 0110011100111s 

17  1011 010010000101s 

8  0001000s 

13  oil  00101 1110s 

14  0001 011111011s 

15  00101101001110s 

12  10110101111s 

16  001011010010101s 

13  oil  001  01  1  1  00s 

14  0001011111010s 

13  001000110101s 

13  001001101100s 

18  000101 10011100110s 

14  1011111111011s 

13  001000111111s 

12  00010110001s 

15  10110100001001s 

14  0110010110010s 

17  1011010000101001s 

13  101101011000s 

14  01 100101 11011s 

14  1011100110011s 

14  1011 01000001 1 s 

13  101101000000s 

13  100110100011s 

13  10111001 1000s 

17  1011010010010111s 

14  1011111111010s 

17  1011010010010110s 

15  01100101100110s 

14  0110010111010s 


66 


216 

1 

35 

1 

217 

1 

36 

1 

218 

1 

37 

1 

219 

1 

38 

1 

220 

1 

39 

1 

221 

1 

40 

1 

222 

1 

41 

1 

223 

1 

42 

1 

224 

1 

43 

1 

225 

1 

44 

1 

226 

1 

45 

1 

227 

1 

46 

1 

228 

1 

47 

1 

229 

1 

.  48 

1 

230 

1 

48 

2 

231 

1 

49 

1 

232 

1 

50 

1 

233 

1 

51 

1 

234 

1 

52 

1 

235 

1 

53 

1 

236 

1 

54 

1 

237 

1 

54 

2 

238 

1 

55 

1 

239 

1 

55 

2 

240 

1 

56 

1 

241 

1 

56 

2 

242 

1 

57 

1 

243 

1 

58 

1 

244 

1 

59 

1 

245 

1 

60 

1 

246 

1 

61 

1 

247 

1 

62 

1 

248 

1 

63 

1 

14  0111000100101s 

14  0010001110101s 

14  001001 1011011s 

13  101110011111s 

13  001001100101s 

14  0010001110100s 

13  100110100010s 

14  100110111 1000s 

14  0010001110111s 

15  000101 10010101s 

14  1011010000010s 

13  101110010000s 

13  101111100101s 

13  001000110100s 

17  1011010010011111s 

14  1001101011101s 

15  00101101000001s 

14  1011111001101s 

15  00010110010100s 

13  100110111010s 

14  101101 0001001s 

17  ,  1011010010011110s 
12  10111111011s 

17  1011010010011001s 

12  10111111101s 

16  001011010010110s 

13  101110010011s 

14  000101 1001111s 

15  00010110010111s 

15  000101 10010110s 

14  1011010001000s 

14  1001101011111s 

14  0010001 110110s 
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APPENDK  B.  STATIC  SLIDE  VARIABLE  LENGTH  CODING  TABLE  (VLC) 


The  following  is  the  custom  VLC  table  used  with  static  slide  sequences.  The  last 
character  in  the  codeword  s  indicates  the  appended  sign  bit. 


INDEX 

LAST 

RUN 

LEVEL 

BITS 

CODEWORD 

1 

0 

0 

1 

4 

101s 

2 

0 

0 

2 

4 

1  10s 

3 

0 

0 

3 

5 

1  000s 

4 

0 

0 

4 

7 

111100s 

5 

0 

0 

5 

7 

001110s 

6 

0 

0 

6 

6 

00110s 

7 

0 

0 

7 

13 

011011010011s 

8 

0 

0 

8 

10 

011101011s 

9 

0 

0 

10 

11 

1111101011s 

10 

0 

0 

11 

11 

1  001  000100s 

11 

0 

0 

13 

11 

1  001  1  01  000s 

12 

0 

0 

20 

15 

00111101111 100s 

13 

0 

0 

24 

12 

00111101 100s 

14 

0 

0 

28 

11 

0110110101s 

15 

0 

0 

32 

9 

00001100s 

16 

0 

0 

36 

9 

011111 00s 

17 

0 

0 

40 

8 

0000100s 

18 

0 

0 

44 

8 

1001100s 

19 

0 

0 

48 

6 

0001  1  S 

20 

0 

0 

52 

8 

0110011s 

21 

0 

0 

56 

8 

0010101s 

22 

0 

0 

60 

9 

00100010s 

23 

0 

0 

64 

4 

010s 

24 

0 

1 

1 

6 

1 1 1 00s 

25 

0 

1 

2 

6 

0001  0s 

26 

0 

1 

3 

8 

0111011s 

27 

0 

1 

4 

10 

111111111s 

28 

0 

1 

5 

10 

001111110s 

29 

0 

1 

6 

9 

001 00001s 

30 

0 

1 

7 

16 

00111101111 0010s 

31 

0 

1 

8 

12 

1001 00101 1 1 S 

32 

0 

1 

10 

14 

01  1  1  1  01  001  000s 

33 

0 

1 

13 

15 

1  1  1  1  1  1  1  00001  01  s 

34 

0 

2 

1 

7 

011100s 

35 

0 

2 

2 

7 

011010s 

36 

0 

2 

3 

8 

0010011s 

37 

0 

2 

4 

11 

0111101 100s 

38 

0 

2 

5 

9 

10010011s 

39 

0 

2 

6 

9 

1001 11 10s 

40 

0 

2 

7 

17 

0011110111111110s 

41 

0 

2 

8 

12 

00101001110s 

42 

0 

2 

10 

14 

111111 1000001s 

43 

0 

2 

11 

15 

1111111001 1000s 

44 

0 

2 

13 

14 

IT  1  1  1  1  100001  1  s 

45 

0 

3 

1 

8 

0111100  S 

46 

0 

3 

2 

8 

0110000s 

47 

0 

3 

3 

9 

00001101s 
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48 

0 

3 

4 

49 

0 

3 

5 

50 

0 

3 

6 

51 

0 

3 

7 

52 

0 

3 

8 

53 

0 

3 

10 

54 

0 

3 

11 

55 

0 

3 

13 

56 

0 

4 

1 

57 

0 

4 

2 

58 

0 

4 

3 

59 

0 

4 

4 

60 

0 

4 

5 

61 

0 

4 

6 

62 

0 

4 

8 

63 

0 

5 

1 

64 

0 

5 

2 

65 

0 

5 

3 

66 

0 

5 

4 

67 

0 

5 

5 

68 

0 

5 

6 

69 

0 

5 

8 

70 

0 

6 

1 

71 

0 

6 

2 

72 

0 

6 

3 

73 

0 

6 

4 

74 

0 

6 

5 

75 

0 

6 

6 

76 

0 

6 

10 

77 

0 

7 

1 

78 

0 

7 

2 

79 

0 

7 

3 

80 

0 

7 

4 

81 

0 

7 

5 

82 

0 

7 

6 

83 

0 

7 

7 

84 

0 

7 

10 

85 

0 

8 

1 

86 

0 

8 

2 

87 

0 

8 

3 

88 

0 

8 

4 

89 

0 

8 

5 

90 

0 

8 

6 

91 

0 

8 

8 

92 

0 

8 

10 

93 

0 

9 

1 

94 

0 

9 

2 

95 

0 

9 

3 

96 

0 

9 

4 

97 

0 

9 

5 

98 

0 

9 

6 

99 

0 

10 

1 

100 

0 

10 

2 

101 

0 

10 

3 

102 

0 

10 

4 

103 

0 

10 

5 

11  00001  1  1  000s 

10  001001010s 

10  001111101s 

17  0011110111111001s 

13  100100010111s 

15  111111100111 00s 

15  11111110011011s 

16  011011010010111s 

8  0110001s 

8  0111111s 

9  00100100s 

10  100110101s 

11  0000111001s 

9  00100011s 

15  111111 10010000s 

10  011111010s 

9  01101100s 

10  011110101s 

12  1001111 1001s 

12  00111100010s 

11  1  1  1  1  1  01  000s 

15  11111110010011s 

9  11111011s 

9  00001010s 

10  100110111s 

13  111111101111s 

11  1  001  001  000s 

12  01111011011s 

15  11111110011111s 

10  111101000s 

10  001000000s 

11  1  001  1  01  001s 

15  01  1  01  1  01  001  01  0s 

16  001001011000011s 

13  011101010000s 

16  001111011111101s 

15  11111110011110s 

9  00101000s 

9  00101111s 

10  100110110s 

11  0010010111s 

12  10011111 000s 

10  011110111s 

15  1  1  1  1  1  1  1  001001  Os 

15  11111110011001s 

9  11110101s 

9  10010000s 

11  001  01  001  1  0s 

15  001111011 10001s 

11  1111010010s 

13  111111101110s 

12  00111101101s 

11  001  000001  1  s 

13  001000001011s 

15  00111101110011s 

13  001111010011s 
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111110101001s 

11111110011101s 

1001111111111s 

0110110111s 

11110100111s 

11111110010110s 

00111101111101s 

11111110010001s 

001  1  1  1 0001  1  S 

01  1  1  1  01  0000  s 

0011110100011s 

100100010101s 

00111101111 0001s 

01 1 10101001  Os 

100111111110s 

00101001111s 

001000001  01  Os 

00111101110010s 

001111011110011s 

111111 10010100s 

01  1  101010001  Is 

11101s 

00000s 

100101s 

00001011s 

00001111s 

0110111s 

10010001 01 1 00s 

001000001 00s 

11111010010s 

111110101011s 

111111 1000000s 

00111 1010010s 

001111011101s 

10010010110s 

0011110111101s 

001111 100s 

00100101101s 

01111010011s 

111110101010s 

0110010s 

11111 00s 

1111110s 

00001 1101s 

00111101 00010s 

00111 1010000s 

001001011001s 

0011110111111111s 

01 1 10100s 

10011101s 

0011110010s 

01111011010010s 

01 1 10101001 1 s 

01110101 00010s 

0010010110001s 

111111 10000100s 


160 

1 

3 

1 

161 

1 

3 

2 

162 

1 

3 

3 

163 

1 

3 

4 

164 

1 

3 

5 

165 

1 

3 

6 

166 

1 

3 

7 

167 

1 

3 

11 

168 

1 

4 

1 

169 

1 

4 

2 

170 

1 

4 

3 

171 

1 

4 

4 

172 

1 

4 

5 

173 

1 

4 

6 

174 

1 

5 

1 

175 

1 

5 

2 

176 

1 

5 

3 

177 

1 

6 

1 

178 

1 

6 

2 

179 

1 

6 

3 

180 

1 

6 

5 

181 

1 

7 

1 

182 

1 

7 

2 

183 

1 

7 

3 

184 

1 

8 

1 

185 

1 

8 

2 

186 

1 

8 

4 

187 

1 

8 

6 

188 

1 

9 

1 

189 

1 

9 

2 

190 

1 

9 

3 

191 

1 

9 

4 

192 

1 

9 

5 

193 

1 

10 

1 

194 

1 

10 

2 

195 

1 

11 

1 

196 

1 

11 

2 

197 

1 

11 

3 

198 

1 

11 

4 

199 

1 

11 

5 

200 

1 

11 

6 

201 

1 

12 

1 

202 

1 

12 

2 

203 

1 

12 

3 

204 

1 

12 

4 

205 

1 

12 

5 

206 

1 

12 

6 

207 

1 

13 

1 

208 

1 

13 

2 

209 

1 

13 

3 

210 

1 

14 

1 

211 

1 

14 

2 

212 

1 

14 

3 

213 

1 

15 

2 

214 

1 

15 

3 

8  0010110s 

8  1111011s 

9  10011 100s 

12  011111011 00s 

11  1001111101s 

11  0011111110s 

17  0011110111111000s 

15  11111110011010s 

11  001111 0000s 

10  001010010s 

13  011110110101s 

14  1  001  0001  01  1  01  s 

17  0011110111111101s 

14  0111101001001s 

10  001011100s 

10  111111110s 

14  1001111111110s 

11  1  001  001  01  0s 

12  11111110110s 

15  01111011010011s 

14  1111111 0001 1 1 s 

10  001011101s 

10  100100011s 

14  0111101 101000s 

11  1001111110s 

12  01110101010s 

16  011011010010110s 

14  1111111000110s 

11  0111110111s 

12  01  1  1  1  01  0001  S 

14  0110110100100s 

15  001111011 10000s 

16  001001011000010s 

11  0011110011s 

15  00100101100000s 

11  1111111010s 

11  0011110101s 

12  10011111110s 

14  1111111000100s 

17  001  1  1  1  01  1  1  1  1  1  1  00s 

13  111110101000s 

11  0011111111s 

11  0110110110s 

12  11110100110s 

13  100100010100s 

16  001111011110000s 

13  011110100101s 

12  01101 101000s 

12  11111010011s 

15  11111110010101s 

12  01110101011s 

11  1  001  001  001  s 

15  11111110010111s 

12  01111101101s 

14  1111111 000101 s 
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APPENDIX  C.  MATLAB  CODE  LIBRARY 


This  appendix  contains  the  Matlab  code  used  in  the  layered  video  codec.  The 
main  code  block,  thesis.m,  is  provided  first,  and  the  supporting  functions  follow  in 
alphabetical  order.  As  provided,  thesis.m  processes  four  video  sequences:  two  motion 
video  sequences  of  100  frames  each,  followed  by  two,  static  slide  sequences  of  50  frames 
each. 


%  thesis 

format  compact 
clear  all 
close  all 


global  QD  TABLE  VLC_DYN  VLC_STA  RV  HUFF__TABLE  LAST 


load ( *  g : \QD_TABLE ' ) 
load (* g:\VlC_dyn* ) 
load (’g:\VLC_sta* ) 

VLC_DyN  =  VLC_dyn;  clear  VLC_dyn 
VLC_STA  ^  VLC_sta;  clear  VLC_sta 
slides 

rand ( ’state* ,  0) 

RV  =  floor (21*rand(99, 1) ) ; 
HUFF_TABLE  =  make_HAC_table; 

LAST  =  ones {99, 1) ; 
threshold  -  160; 
count  =  -1; 
mse  =  1; 
display  =1; 
write  =1/ 
last  =  299; 
toff  =  0; 

frame_type  -  0; 
m_mat_ndx  =  [ ] ; 
selected) 

f_vec  =  zeros (1584, 16) ; 

f_far_ll  =  double (zeros (792, 8) ) ; 
f_far__lh  =  double  (zeros  (792, 8) ) ; 
f_far__hl  -  double  (zeros  (792, 8) )  ; 
f_far_hh  =  double (zeros (792,8)); 

f_far_ll_pl  =  double (zeros (792, 8) ) ; 
f_far_lhjpl  =  double (zeros (792, 8) ) ; 
f_far_hl_pl  =  double (zeros (792,8)); 
f _f ar_hh_pl  =  double ( zeros (792,8)); 

f_far_ll_p2  =  double (zeros (792, 8) )  ; 
f_far_lh_p2  =  double (zeros (792,8)); 
f  _f  ar_hl_p2  =  double { zeros (792,8)); 
f_far_hh_p2  =  double (zeros (792,8) ) ; 

f__l  =  double  (zeros  (1584, 16) )  ; 
f_2  =  double (zeros (1584, 16) ) ; 
f  3  =  double (zeros (1584,16)); 


%  gets  custom  VLC  table  for  motion  video 
%  gets  custom  VLC  table  for  static 


%  seed  for  reproductivity 
%  for  aging  algorithm 
%  gets  JPEG  standard  table 
%  sets  block  selection  to  1  initially 
%  for  use  with  asd 
%  where  is  it  in  the  loop? 

%  calculate  MSE? 

%  show  the  images? 

%  write  to  file  for  evaluation? 

%  last  frame  #  to  consider 
%  intialize  (used  with  scene  change) 

%  initialize  (zero  means  dynamic  scene) 
%  initialize  (track  macroblocks 

%  initialize  (reshaped  frame) 

%  initialize  for  decoder 
%  initialize  for  decoder 
%  initialize  for  decoder 
%  initialize  for  decoder 

%  initialize  for  decoder 
%  initialize  for  decoder 
%  initialize  for  decoder 
%  initialize  for  decoder 

%  initialize  for  decoder 
%  initialize  for  decoder 
%  initialize  for  decoder 
%  initialize  for  decoder 

%  initialize  for  decoder 
%  initialize  for  decoder 
%  initialize  for  decoder 
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beta  =  -11.346; 


%  slope  of  rate  control  curve 


br  =  input {'Enter  the  target  bitrate  (Kbps)  \n  »  ' ) 

B_bar  =  br  *  1024/10; 

qd.entry  =  min(find(B_bar  >  QD_TABLE( : ,  4) ) ) ;  %  intial  triplet  based  on  test  averages 


for  i  =  0:last  %  loop  through  frames 

count  =  count+1 
if  {i<=99) 

i.-Present  =  get_next_image(i)  ; 
elseif  (  (i>=100)  &  (i<=199)  ) 

i_present  =  get_nGxt_image_2  (i-100)  ; 
elseif  (i==200) 

i-J?resent  =  get_next_image(101) ; 
elseif  (i==250) 

i_present  =  get_next_image(102)  ; 
else 

i_present; 

end 

if  display  %  diplays  original  if  desired 

figure; 

sxibplot  (2,2, 1) 
image ( i_present ) 
colonnap(gray{256) ) 
title ([' Original '] ) 
axis  off 

end 


%  read  head#, bmp 
%  read  ncaa#.bmp 
%  read  busy  text  slide 
%  read  line  drawing  slide 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Coder  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


%%%%%%%%%%  Change  dimensions  of  image  and  identify  macroblocks  by  threshold  %%%%%%%% 


f_last  =  f_vec;  %  buffers  previous  frame 

f_vec  =  shape  (i_present) ;  %  shapes  current  frame 

[m_vecjidx,frame_type]  =  m__blk_id_xr  (f_last,  f_vec,  threshold) ;  %  compare  current  to 
last 

in_mat_ndx  =  [m_mat_ndx,nuvec_n^] ;  %  matrix  of  MB's  selected 


if  (sum(m_vec_ndx)  >=  65) 

toff  =  1; 

end 


%  triggers  a  scene  change 
%  "trigger"  flag  set  "on" 


if  (toff) 
toff  =  0; 
flag  =  1; 
ql  =  64; 
q2  =  1000; 
q3  =  1000; 
elseif  frame^type 
ql  =  4; 
q2  =  16; 
q3  =  16; 


%  scene  change  just  occured 
%  resets  "trigger"  flag 
%  "flag"  for  1st  frame  after  scene  change 
%  heavily  compressed  scene  change 


%  triplet  for  static  sequence 


®lse  %  a  dynamic  frame  sequence 

delta_Binter  =  B_bar  -  {sum(m_vec_ndx) /sum(m_inat_ndx( : ,  i) ) )  *  total (i); 
qd_entry  =  get_qd_entryf  {flag,B_bar,qd_Gntry,delta_Binter,  sum(m_vec_ndx)  ,beta)  ; 
flag  =0;  %  resets  flag 

ql  =  QD_TABLE(qd_entry,l) ;  %  rate  control  triplets  fetched 

q2  =  QD_TABLE{qd_entry,2) ; 
q3  =  QD__TABLE{qd__entry, 3)  ; 

end 


Q  =  get_Ql_matrix(ql) ; 


%  quantizer  matrix  (via  JPEG  standard) 


if  -frame_type 


%  if  dynamic  frame  sequence 
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%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Transforms  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
tf_fht_ll,f_fht„lh,  f_fht_hl,f_fht_hh  ]  =  fht  (f_vec,m_vec_ndx,  16) ;  %  FHT  of  frame 


f_fht_ll  =  f_fht_ll  -  128 7 

f-fht_ll  =  dct_of_fht(f_fht_ll,m_vec_nc3x) ; 

[f_fht_lh_ll,  f_fht_lh_lh,  f_fht_lluhl,  f_fht_lh_hh] 
f ht  ( f_fht_lh, itL_vec_ndx,  8 ) ; 

[f„fht_hl_ll,  f_fht31-.lh,  f_fht_hl_hl,  f_fht_hl_hh] 
f ht  ( f_f ht_hl ,  m_vec_ndx,  8 ) ; 


%  level  shift  of  LL 
%  2_D  DCT  of  LL 

%  snbband  LH  FHT 

%  subband  HL  FHT 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Quantizing  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


f _f ht_l l_q  =  quant i z er_l  1  { f _f ht_l  1 ,  Q ,  m__vec_ndx ) ;  %  quantize  LL 


f_fht_hh_q  =  round  ( f_fht„hh/q3 ) ; 


f-fht_lh_ll_q 

f_fht_lh„lh_q 

f_fht_lh„hl_q 

f_fht_lh_hh_q 


round  ( f  _f  ht_lh_ll  /q2 ) 
round  ( f  _f  ht_lh_lh/q2 ) 
round  ( f  _f  ht„lh_Jil  /q3 ) 
roxind  ( f_f  ht_lh_hh/q3 ) 


f_fht_hl_ll„q 

f_fht_hl_lh_q 

f_fht_hl-hl_q 

f_fht_hl-hh_q 


round  ( f_f  ht_hl_ll  /q2 ) 
round  ( f_f  ht_hl_lh/q3 ) 
round  ( f_f  ht_hl_hl  /q2 ) 
round  ( f_fht_hl_hh/q3 ) 


%  quantize  HH 
%  quantize  of  LH  subbands 


%  quantize  of  HL  subbands 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%^%%%%%%%%%%^ 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  LL  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


%  zig-zag  scans  each  LL  8x8  (Results  are  one  6336x1  vector  [ll_zz]  and  the  index 
%  of  the  last  non-zero  entity  [last_ll_zz]  .  Get  "inf  if  a  group  of 

%  64  is  all  zeros.) 

Elast_ll_zz,ll_zz3  =  zzb(f_fht_ll_q, 8)  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size.) 
a_ll_zz  =  inake_i t^compact  ( 1  l_z z ,  las t_ll_z z ,  8 ) ; 


%  parsing  LL  with  Huffman  routine 
parsed_ll_zz  =  parse_Huff  (a_ll_zz,  last_ll_zz) ; 

%  gets  bits  per  frame  due  to  LL 

bits_ll_zz (i+1)  =  get_bits_Huff (parsed_ll_zz) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  HH  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


%  scans  each  HH  8x8  (Results  are  one  6336x1  vector  [hh_r3  and  the  index  of  the  last 
%  non-zero  entity  [last_hh_r3 ,  where  r  indicates  horizontal  raster 

%  the  scan  method.  Get  "inf"  if  a  group  of  64  is  all  zeros.) 

[last_hh_r,hh_r]  =  raster  (f_fht_hh_q,  8) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hh„r  =  make_it_compact  (hh_r,  last_hh_r,  8) ; 

%  parsing  HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hh__r_3D  =  parse_3D(a_hh_r,  last_hh_r)  ; 
l_P__hh_r_3 D  ( i+ 1 )  =  length  { par sed_hh_r_3  D  ( : ,  1 ) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

Ebits_hh_r_3D{i+l) ,  hh_r_22  (i+1)  3  =  get_bits2  (parsed_hh_r_3D,  frame_type) ; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  LH  subbands  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
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%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_LL  4x4  (Results  are  one  1584x1  vector  [lh_ll_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_ll_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf”  if  a  group  of  16  is  all 

%  zeros . ) 

[last_lh_ll„r,lh_ll_r]  =  raster  (f_fht_lh_ll_q, 4)  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_l  l_r  =  make_i t_compact  ( lh_l  l_r  i  las  t_lh_l  l__r ,  4 )  ; 

%  parsing  LH_LL  and  getting  the  nximber  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_ll_r_3D  =  parse_3D(a_lh_ll_r,  last_lh_ll_r)  ; 

1  _P-lh_l  l_r_3D  ( i+ 1 )  =  length  ( par sed_lh_l  l_r_3  D  ( : ,  1 ) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_lh_ll_r_3D(i+l) ,  lh_ll_r_22  (i+1)  ]  =  get_bits2  (parsed_lh_ll_r_3D,  frame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_LH  4x4  (Results  are  one  1584x1  vector  [lh_lh_r3  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_lh_r3 #  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

[  las t_lh_lh_r ,  lh_lh^r ]  =  ras  ter  ( f _f  ht_lh_lh_q ,  4 )  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_lh_r  =  make_it_compact  ( lh_lh_r ,  last_lh_lh_r ,  4 ) ; 

%  parsing  LH_LH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_lh_r_3D  =  parse_3D(a_lh_lh_r,  last_lh_lh_r)  ; 
l-.P-_lh_lh_r_3 D  ( i+1 )  =  length  (par sed_lh_lh_r_3 D  ( : ,  1 ) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_lh_lh_r_3D  { i+1 ) ,  lh__lh_r_22  ( i+1 )  ]  =  get_bits2  (parsed_lh_llx_r_3D,  f rame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_HL  4x4  (Results  are  1584x1  vector  [lh_hl_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_hl_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

[ las t__lh_hl_r , lh_hl_r ]  =  raster  ( f _f ht_lh_hl_q ,4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_hl_r  =  make_it_compact  { lh_hl_r ,  las t_lh__hl_r ,  4 )  ; 

%  parsing  LH^HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_hl_r_3D  =  parse„3D(a_lh_„hl_r,  last_lh_hl_r)  ; 
l_p_lh_hl_r_3D  ( i+1 )  =  length  (parse<i_lh_hl_r_3D  ( :,!)); 

%  gets  bits  per  frame  and  niimber  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_lh_hl_r_3D(i+l) ,  lh_hl_r_22  (i+1)  ]  =  get_bits2  (parsed_lh_hl_r_3D,  frame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_HH  siibband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_HH  4x4  (Results  are  1584x1  vector  [lh_hh_r]  ^d  the  index  of 
%  the  last  non-zero  entity  [last_lh_hh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

[  las  t_lh__hh_r ,  lh_hh_r  ]  =  ras  ter  ( f  _f  ht_lh_hh_q  ,4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
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a_lh_hh_r  =  make_i t_compac t  ( lh_hh_r ,  las t_lh_hh_r ,  4 ) ; 

%  parsing  LH_HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_hh_r_3D  =  parse_3D(a_lh_hh_r,  last_lh_hh_r)  ; 
l_p_lh_hh_r_3 D  ( i +1 )  =  length  ( par s ed_lh_hh_r_3  D  ( : ,  1 ) )  ; 

%  gets  bits  per  frame  and  nxunber  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_lh_hh_r_3D{i+l) ,  lh_hh_r_22  (i+l)  ]  =  get_bits2  (parsed_lh_hh_r_3D,  frame_type)  ; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%^%%%%%%%%%%%%%^ 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  HL  subbands  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_LL  4x4  (Results  are  a  1584x1  vector  [hl__ll_v]  and  the  index  of 
%  the  last  non-2ero  entity  [last_hl_ll_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.  Get  "inf  if  a  group  of  16  is  all 

%  zeros . ) 

[  last_hl_ll_v,  hl_ll_v3  =  vertical  ( f_f ht_hl_l l_q,  4 )  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_ll_v  =  make_it_compact(hl_ll_v,  last_hl_ll_v,4)  ; 

%  parsing  HL_LL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  nxjmber  is  used. 

parsed_hl_ll_v_3D  =  parse_3D(a_hl_ll_v,  last_hl_ll_v)  ; 
l_P_hl_l l_v_3  D  ( i+ 1 )  =  length  ( pars ed_hl_l  l_v_3  D  ( : ,  1 ) ) ; 

%  gets  bits  per  frame  and  nximber  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_hl_ll_v_3D{i+l) ,  hl_ll_v_22  (i+l)  ]  =  get_bits2  (parsed_hl_ll_v_3D,  frame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL„LH  4x4  (Results  are  a  1584x1  vector  [hl_lh_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_lh_v3 ,  where  the  v  indicates 

%  vertical  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

I last_hl__lh_v,  hl_lh__v]  =  vertical  ( f_f ht_hl_lh_q,  4 )  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_lh_v  =  make_it_compact(hl_lh_v, last_hl_lh_v, 4)  ; 

%  parsing  HL_LH  and  getting  the  nxamber  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hl_lh_,v_3D  =  parse_3D(a__hl_lh_v,  last_hl_lh_v)  ; 
l_p__hl_lh_v_3  D  ( i +1 )  =  length  ( par sed_hl_lh_v_3  D  ( : ,  1 ) )  ; 

%  gets  bits  per  frame  and  nximber  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_hl_lh__v_3D(i+l) ,  hl_lh_v_22  (i+l)  ]  =  get_bits2  (parsed_hl_lh_v_3D,  frame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_HL  4x4  (Results  are  a  1584x1  vector  [hl_hl_v]  and  the  index  of 
%  the  last  non-zero  entity  [  last_hl_hl_v3  /  where  the  v  indicates 

%  vertical  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

[last_hl__hl_v,hl_hl_v]  =  vertical  (f_fht_hl_hl_q,  4)  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_hl_y  =  make_it_compact(hl_hl_v,  last_hl_hl_v,4)  ; 

%  parsing  HL_HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hl_hl_v_3D  =  parse_3D{a_hl_hl_v,  last_hl_hl_v)  ; 
l_p_hl_hl_v_3D  ( i+l )  =  length  (parsed_hl_hl-V_3D  ( : ,  1 ) )  ; 
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%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_bl«hl_v_3D(i+l) ,  hl_hl_v_22  (i+l)  ]  =  get_bits2  (parsed_hl_hl_v_3D,  frame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_HH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_HL  4x4  (Results  are  a  1584x1  vector  [hl_hh_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_hh_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

[ las t_hl_hh_v, hl_hh_v]  =  vertical  ( f_f ht_hl_hh_q ,  4 ) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_hh_v  =  make_it_compact  (hl_hh^v,  last_hl_hh_jv,4)  ; 

%  parsing  HL_HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hl_hh_v_3D  =  parse_3D{a_hl_hh_v,  last_hl_hh_v) ; 
l_p_hl_hh_v_3D  ( i+1 )  =  length  (parsed_hl_hh_v_3  D  ( : ,  1 ) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_hl_hh_v_3D  ( i+1 ) ,  hl_hh_v_22  ( i+1 )  ]  =  get_bits2  (parsed_Jil_hh_v_3D,  f rame_type)  ; 

%  getting  bits  per  frame  per  layer 
layerl ( i+1 )  =  bits_ll_Z2 ( i+1 ) ; 

layer2(i+l)  =  bits_lh_ll_r_3D(i+l)  +  bits_lh_lh_r_3D(i+l)  + _ 

bits_hl_ll_v_3D(i+l)  +  bits_hl_hl_y_3D(i+l) ; 

layer3(i+l)  =  bits_hh_r_3D(i+l)  +  bits_lh_hl_r_3D(i+l)  +  bits_lh_hh_r_3D(i+l)  + _ 

bits_hl_lh_v_3D(i+l)  +  l_p_hl_hh_v_3D(i+l)  ; 
total (i+1)  =  layerl (i+1)  +  layer2(i+l)  +  layer3(i+l); 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%  % 
%  Channel  % 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Unquantize  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


f-.fht_ll„uq 


iinquantize_ll  (f_fht_ll_q,Q,m_vec_ndx)  ;  %  unquantize  LL 


f_fht_hh_ug  =  f_fht_hh_q  *  q3; 


f_fht_lh_ll_uq  =  f_fht_lh_ll_q  *  q2; 
f_fht_lh_lh_uq  =  f_fht_lh_lh_q  *  q2; 
f_fht_lh_hl_uq  =  f_fht_lh_hl_q  *  q3  ; 
f_fht_lh_hh_uq  :=  f_fht_lh_hh_q  *  q3; 

f_fht_hl_ll_uq  =  f_fht_hl_.ll_q  *  q2; 
f_fht_hl_lh_uq  =  f_fht_hl_lh_q  *  q3; 
f-.fht_hl_hl_uq  =  f_fht_hl_hl_q  *  q2; 
f_fht_hl_hh_uq  =  f_fht_hl_hh_q  *  q3; 


%  unquantize  HH 
%  unquantize  of  LH  subbands 


%  unquantize  of  HL  subbands 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Inverse  Transform  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


f_far_ll  invdct_of_fht  (f_fht_ll_uq,m_vec_ndx) ;  %  inv  2-D  dct  of  LL 

f_far_ll  =  f_far_ll  +  128;  %  level  shift 

f_far_lh  =  remake_3  {f_far_lh,  f_fht_lh_ll_uq,  f_fht_lh_lh_uq,  f_fht_lh_hl_uq,  . . . 

f_fht_lh_hh_uq,m_vec_ndx,  8)  ;  %  LH  subband  inv  FHT 

f-.far_hl  =  remake_3  {f_far_hl,  f_fht_hl_ll_uq,  f_fht_hl-lh-uq,  f_fht_hl_hl_uq,  . . . 

f_fht_hl_hh_uq,m_vec_ndx,  8) ;  %  HL  subband  inv  FHT 

f_far_hh  =  f_fht_hh_uq; 

f_far_ll_p2  =  f_far_ll;  %  LL  p2  assignment 

f_far_lh_p2  =  remake_3  (f_far_lh_p2,  f_fht_lh_ll_uq,  f_fht_lh_lh_uq, , . . 

0,0,m_vec„ndx,8) ;  %  LH  p2  subband  inv  FHT 

f_far_hl_p2  =  remake_3  (f_far_hl_p2,  f_fht_hl_ll_uq,  0,  , . . 

f_fht_hl_hl_uq,0,m__vec_ndx,8) ;  %  HL  p2  subband  inv  FHT 
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f_far_hh_p2  =  0; 


f_far_lh_jpl  =  0; 

=  0; 

f_far_hh_j>l  =  0; 


%  HH  p2  assigiunent 

%  LL  pi  assignment 
%  HH  pi  assignment 
%  HH  pi  assignment 
%  HH  pi  assignment 


f_3  =  remake_3  ( f_3 ,  f_f ar_ll ,  f_f ar_lh,  f_f ar_hl , . . . 

f_far_hh,m_vec_ndx,16) ;  %  frame  inv  FHT  with  3 

f_2  =  remake__3  ( f_2 ,  f_f ar_ll_p2 ,  f_f ar_lh_p2 ,  f_f ar_hl^2 , . . . 

f_far_hh_p2,m_vec_ndx,16) ;  %  frame  inv  FHT  with  2 

f_l  =  remake_3  ( f _1 ,  f_f ar_ll_pl ,  f_f ar_lh_jpl ,  f_f ar_hl-Pl  /-•- 

f_far_hh_pl,m_vec_ndx,16) ;  %  frame  inv  FHT  with  1 


layers 

layers 

layer 


else 


%  a  staic  frame  sequence 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Transforms  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


[f_fht_ll,f_fht_lh,f_fht_hl,f_fht_hh  ]  =  fht (f_vec,m_vec_ndx,  16) ;  %  FHT  of  frame 


[f«fht_ll_ll,  f_fht_ll_lh,  f_fht_ll_hl,  f_fht_ll_hh] 
f  ht  ( f  _f  ht_ll ,  m_vec_ndx,  8 ) ; 

Ef_fht_lh_ll,  f«fht_lh_lh,  f^fht^lh^hl,  f_fht„lh_hh] 
f  ht  { f _f  ht„lh ,  m_vec_ndx ,  8 ) ; 

[f-fht^hl^ll,  f_fht_hl_lh,  f_fht_hl_hl,  f_fht_hl.hh] 
f  ht  ( f _f  ht Jil ,  m_vec_ndx  ,8); 

[f_fht_hh_ll,  f_fht_hh_lh,  f_fht_hh_hl,  f_fht__hh_hh] 
f ht  ( f _f ht_hh ,  m_vec_ndx ,  8 )  ; 


%  subband  LL  FHT 

%  subband  LH  FHT 

%  stibband  HL  FHT 

%  subband  HH  FHT 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Quantizing  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


f-.fht_ll_ll_q 

f_fht_ll_lh_q 

f_fht_ll_hl_q 

f_fht_ll_hh_q 


round  ( f_f  ht_ll_„ll  /ql ) 
ro\ind  ( f__fht_ll_lh/q2 ) 
round  ( f  _f  ht_ll_hl  /q2 ) 
ro^ind  ( f_f  ht_ll_hh/q3 ) 


f_fht_lh_ll_q 

f_fht.lh_lh_q 

f_fht_lh_hl„q 

f-fhOh_hh_q 


round ( f_fht_lh_ll/q2 ) 
round  ( f_f  ht_lh_lh/q2 ) 
round ( f _f ht_lh_hl /q3 ) 
round  ( f_f  ht_lh_hh/q3 ) 


f_fht_hl_ll_q 

f^fht_hl„lh„q 

f-.fht_hl_hl_q 

f_fht_hl_hh_q 


round  ( f_f  ht_hl_ll  /q2 ) 
round  ( f_fht_hl_lh/q3 ) 
round  ( f  _f  ht_hl_hl  /q2 ) 
round  ( f_fht_hl^hh/q3 ) 


f-fht_hh_ll_q  =  roxind(f_fht_hh_ll/q3) 
f_fht_hh_lh_q  =  round  (f_fht_hh_lh/q3) 
f_fht_hh_hl_q  =  round  (f_fhtjbh_hl/q3) 
f_fht_hh_hh_q  =  ro\and(f_fht_hh_hh/q3) 


%  quantize  of  LL  subbands 


%  quantize  of  LH  subbands 


%  quantize  of  HL  subbands 


%  quantize  of  HL  siabbands 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  LL  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LL_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LL_LL  4x4  (Results  are  one  1584x1  vector  [ll_ll_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_ll_ll_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  “inf  if  a  group  of  16  is  all 

%  zeros . ) 

[last_ll_ll_r,ll_ll_r]  =  raster  (f_fht_ll_ll_q, 4)  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_ll_ll_r  =  make_it__compact  ( ll_ll_r ,  last_ll_ll_r ,  4 ) ; 
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%  parsing  LL_LL  and  getting  the  nxomber  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  n\ainber  is  used. 

parsed_ll_ll_r_3D  =  parse_3D(a_ll_ll_r,last_ll_ll_r); 
l_p_ll_ll_r_3D{i+l)  =  length  (par  sed_ll_ll_r_3D{  :,!)); 

%  gets  bits  per  frame  and  niimber  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_ll_ll_r_3D(i+l) ,  ll_ll_r_22  (i+1)  ]  =  get_bits2  (parsed_ll_ll_r_3D,  f rame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LL_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LL_LH  4x4  (Results  are  one  1584x1  vector  [ll_lh_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_ll_lh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf  if  a  group  of  16  is  all 

%  zeros . ) 

[last_ll_lh_r, ll_lh__r]  =  raster (f_fht_l l_lh_q, 4)  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_ll_lh_r  =  inake_it_compact  ( ll_lh_r ,  las t_ll_lh_r ,  4 )  ; 

%  parsing  LIi_LH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_ll_lh_r_3D  =  parse_3D(a_ll_lh_r,  last_ll_lh_r)  ; 
l_p_ll_lh_r_3D  ( i+1 )  =  length  (parsed_ll_lh_r_3D  ( :,!)); 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits„ll_lh_r_3D  ( i+1 ) ,  ll_lh_r_2 2  ( i+1 )  ]  =  get_bi ts2  (parsed_ll_lh_r_3D,  f rame_type )  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LL_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LIj_HL  4x4  (Results  are  1584x1  vector  [ll_hl_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_ll_hl_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

[  las t_l l_hl_r ,  1  l_hl_r ]  =  raster  ( f _f ht_l l_hl_q ,  4 )  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_l  l_hl_r  =  make_i t_compact  ( 1  l_hl_r ,  las  t_l  l_hl_r ,  4 ) ; 

%  parsing  LL_HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  niamber  is  used. 

parsed_ll_hl_r_3D  =  parse_3D(a_ll_hl_r,  last_ll_hl_r)  ; 
l_p_ll_hl_r_3D  { i+1 )  =  length  (parsed_ll_hl_r_3D  ( :,!)); 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_ll_hl_r_3D(i+l) ,  ll_hl_r_22  (i+1)  ]  =  get_bits2  (parsed_ll_hl_r_3D,  frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LL_HH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LL__HH  4x4  (Results  are  1584x1  vector  [ll_hh_r]  and  the  index  of 
%  the  last  non-zero  entity  Elast_ll_hh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

( las t_l  l_hh_r ,  1  l_hh_r  ]  =  ras  t er  ( f _f ht_l  l_hh_q ,  4 )  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  va2:ying  size) 
a_ll_hh_r  =  make_it_compact  ( ll_hh_r ,  last_ll_hh_r ,  4 )  ; 

%  parsing  LL_HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_ll_hh_r_3D  =  parse„3D(a_ll_hh_r,last_ll_hh_r); 
l_p_ll_hh_r_3D(i+l)  =  length(parsed_ll_hh_r_3D( : ,  1) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

Ibits_ll_hh__r_3D  ( i+1 ) ,  ll_hh__r_22  { i+1 )  ]  =  get_bi ts2  (parsed_ll_hh_r_3D,  f  rame_type ) ; 
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%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  LH  subbands  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%^%%%%%’^^^^ 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_LL  4x4  (Results  are  one  1584x1  vector  [lh_ll_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_ll_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method-  Get  “inf  if  a  group  of  16  is  all 

%  zeros . ) 

[last_lh_ll_r,lh_ll_r]  =  raster  (f_fht_lh_ll_q,  4)  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_ll_r  =  make_it_compact  (lh_ll_r,  last_lh_ll_r ,  4)  ; 

%  parsing  LH_LL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  nximber  is  used. 

parsed_lh_ll_r_3D  =  parse_3D(a_lh_ll_r,  last_lh_ll_r)  ; 
l_p_lh_l  l_r_3  D  ( i+1 )  =  length  ( par  s  ed_lh_l  l_r_3  D  ( : ,  1 ) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_lh_ll„r_3D(i+l),  lh_ll_r_22  (i+1)  ]  =  get_bits2  (parsed_lh_ll_r_3D,  frame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  riH_LH  4x4  (Results  are  one  1584x1  vector  [lh_lh_r]  and  the  index  of 
%  the  last  non-zero  entity  Elast_lh_lh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf  if  a  group  of  16  is  all 

%  zeros.) 

[last_lh_lh_r,lh_lh_rl  =  raster  (f_fht_lh_lh_q, 4)  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_lh_r  =  make_it_compact  (lh_lh_r ,  last_lh_lh_r ,  4 )  ; 

%  parsing  LH_LH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used, 

par sed_lh_lh_r_;3 D  =  par se_3  D  ( a_lh_lh_r ,  las  t_lh_lh_r )  ; 
l_p_lh_lh_r_3D  ( i+1 )  =  length  (parsed_lh_lh_r_3D  ( : ,  1 ) ) ; 

%  gets  bits  per  frame  and  nximber  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_lh_lh_r_3D(i+l) ,  lh_lh_r_22  (i+1)  ]  =  get_bits2  (parsed_lh_lh_r_3D,  frame„type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_HL  4x4  (Results  are  1584x1  vector  [lh_hl_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_hl_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_lh_hl_r,lh_hl_r]  =  raster  (f_fht_lh_hl_q,  4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_hl_r  =  inake_it_compact  ( lh_hl_r ,  last_lh_hl_r ,  4 )  ; 

%  parsing  LH_HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  niimber  is  used, 

pars ed__lh_hl_r_3  D  =  par se_3 D  ( a_lh_hl_r ,  las  t_lh_hl_r )  ; 
l_p_lhJtil_r_3D  ( i+1 )  =  length  (parsed_lh_hl_r_3 D  { : ,  1 ) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_lh_hl_r_3D(i+l) ,  lh_hl_r_22  (i+1)  ]  =  get_bits2  (parsed_lh_hl_r_3D,  frame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_HH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_HH  4x4  (Results  are  1584x1  vector  [lh_hh_r]  and  the  index  of 
%  the  last  non-zero  entity  Elast_lh_hh_r3 ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 


%  zeros . ) 

[  las t_lh_hh_r ,  lh_hh_r ]  =  raster  ( f _f ht_lh_hh_q,  4 )  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_hh„r  =  inake_i t_compac t  { lh_hh__r ,  las t_lh_hh_r ,  4 )  ; 

%  parsing  and  getting  the  nuinber  of  parsings  for  later  used  in  eval_thesis  to 

%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_hh_r_3D  =  parse_3D(a_lh_hh_r,  last_lh_hh„r)  ; 
l_p_lh_hh_r_3 D  ( i+1 )  =  length  { par sed_lh_hh_r_3 D  ( : ,  1 ) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_llx_hh_r_3D(i+l) ,  lh_hh_r_22  (i+1)  ]  =  get_bits2  (parsed_lh_hh_r_3D,  frame_type)  ; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  HL  siibbands  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_LL  4x4  (Results  are  a  1584x1  vector  [hl_ll_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_ll_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.  Get  "inf  if  a  group  of  16  is  all 

%  zeros . ) 

[ las t_hl_ll_v,  hl_ll_v]  =  vertical  ( f_f ht_hl_ll_q,  4 )  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl__ll_v  =  make_it_compact  (hl_ll_v,  last_hl_ll_v,  4)  ; 

%  parsing  HL_LL  and  getting  the  n\amber  of  parsings  for  later  used  in  eval_H:hesis  to 
%  get  the  percent  of  time  that  the  default  bit  nxamber  is  used. 

parsed__hl_ll_v_3D  =  parse_3D(a_hl_ll_v,  last_hl_ll_v) ; 
l_P-.hl_ll_v_3D(i+l)  =  length  ( par sed_hl__ll_v_3D( : ,  1) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_hl_ll_v_3D(i+l) ,  hl_ll_v_22  (i+1)  ]  =  get_bits2  (parsed_hl_ll_v_3D,  frame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_LH  4x4  (Results  are  a  1584x1  vector  [hl_lh_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_lh_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.  Get  "inf  if  a  group  of  16  is  all 

%  zeros . ) 

[  las t_hl_lh__v,  hl_lh_v]  =  vertical  ( f _f ht_hl_lh_q ,  4 )  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_lh_y  =  make_it_compact  (hl_lh_v,  last_hl_lh_v,4)  ; 

%  parsing  HL_riH  and  getting  the  niimber  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  niimber  is  used. 

parsed_hl_lh_v_3D  =  parse_3D(a_hl_lh_v,  last_hl_lh_v)  ; 
l_P-.hl_lh_v_3D(i+l)  =  length (parsed_hl_lh_v_3D( 1) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_hl_lh_v_3D(i+l),  hl_lh_v_22  (i+1)  ]  =  get_bits2  (parsed_hl_lh_v_3D,  frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_HL  4x4  (Results  are  a  1584x1  vector  [hl_hl_y]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_hl_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

[ last_hl_hl_v,  hl_hl_v]  =  vertical  ( f _f ht_hl_hl_q ,  4 )  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a— hl„hl_v  =  make_it_compact(hl_hl__v,  last_hl_hl_v,4)  ; 
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%  parsing  HL_HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  nxamber  is  used. 

parsed_hl_hl_jv_3D  =  parse_3D(a_hl_hl_v,  last__hl„hl_v)  ; 
l_p_hl_hl_v_3 D  ( i+1 )  =  length  (par sed_hl_hl_v_3 D  ( :  / 1 ) )  ; 

%  gets  bits  per  frame  and  nxamber  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_hl_hl_jv_3D(i+l) ,  hl_hl_v_22  (i+1)  ]  =  get_bits2  (parsed_hl_hl_v_3D,  frame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_HH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_HL  4x4  (Results  are  a  1584x1  vector  [hl_hh_v3  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_hh_v3 ,  where  the  v  indicates 

%  vertical  raster  scan  method.  Get  “inf"  if  a  group  of  16  is  all 

%  zeros . ) 

[  las t_hl  jili_v,  hl_hh_v]  =  vertical  ( f _f ht_hl_hh__q ,  4 )  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_hh_v  =  niake_i t_compac t  { hl_hh_v ,  las  t_hl_hh_v ,  4 )  ; 

%  parsing  HL_HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  niamber  is  used. 

parsed_hl_hh_v_3D  =  parse_3D(a_hl_hh„V/ last_hl_hh_v)  ; 
l_P_bl_hh_v_3 D  ( i+1 )  =  length  (par sed_hl_hh_v_3D  (:,!)); 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_hl_hh_v_3D(i+l) ,  hl_hh_v_22 (i+1) ]  =  get_bits2 (parsed_hl_hh_v_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  HH  subbands  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HH_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HH_LL  4x4  (Results  are  one  1584x1  vector  [hh_ll_r3  and  the  index  of 
%  the  last  non-zero  entity  [last_hh_ll_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros , ) 

[last_hh^ll_r,hh_ll_r3  =  raster (f_fht_hh_ll_q, 4)  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hh_ll_r  =  make_it_compact  (hh_ll_r,  last_hh_ll_r,  4)  ; 

%  parsing  HH_LL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hh_ll_r_3D  =  parse_3D(a__hh_ll_r,  last_hh_ll_r)  ; 
l_P_hh_l  l_r__3 D  { i + 1 )  =  length  ( par sed_hh_l  l_r_3 D  ( : ,  1 ) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_hh_ll_r_3D(i+l) ,  hh_ll_r_22  (i+1)  3  =  get_bits2  (parsed_hh_ll_r_3D,  frame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HH_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  sccuis  each  HH_LH  4x4  (Results  are  one  1584x1  vector  [hh_lh^r]  and  the  index  of 
%  the  last  non- zero  entity  [last_hh_lh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

t  las t_hh_lh_r ,  hh_lh_r  3  =  ras  t er  ( f _f ht_hh_lh_q ,  4 ) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hh_lh_r  =  inake_it_compact  (lih_lh_r,  last_hh_lh_r,  4)  ; 

%  parsing  HH_LH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used, 

par sed_hh_lh_r_3 D  =  parse_3 D  ( a_hh_lh_r ,  las t_tih_lli^r ) ; 
l_P_hh_lh_r_3D  ( i+1 )  =  length  {parsed_hh_lh_r_3D  ( : ,  1 ) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 
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[bits3ii_lh_r_3D(i+l) ,  hh_lh_r_22  (i+1)  3  =  get_bits2  (parsed_hh_lh_r_3D,  frame^type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HH_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HH_HL  4x4  (Results  are  1584x1  vector  [hh_hl_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_hh_hl_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

[ las t_hh__hl_r ,  hh_hl_r ]  =  raster  ( f _f ht_hh_hl_q ,  4 )  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hh_hl_r  =  inake_it_compact  (hh__hl_r,  last_hh_hl_r,4)  ; 

%  parsing  HH_HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hh_hl_r_3D  =  parse_3D{a_hh_hl_r, last_hh_hl_r)  ; 
l_p_hh_hl„r_3D{i+l)  =  length(parsed_hh_hl_r_3D( : ,  1) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_hh_hl_r_3D{i+l) ,  hh_hl_r_22  (i+1)  ]  =  get_bits2  (parsed_hh_hl_r_3D,  frame_type)  ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HH_HH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HH_HH  4x4  (Results  are  1584x1  vector  [hh_hh_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_hh_hh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.  Get  "inf"  if  a  group  of  16  is  all 

%  zeros . ) 

[last_hh_hh_r,hh_hh„r]  =  raster  (f_fht_hh_hh_q,  4)  ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hh_hh_r  =  make_it_compact  (hh_hh_r,  last_hh_hh_r,4)  ; 

%  parsing  HH_HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to  ■ 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

par sed_hh„hh_r„3 D  =  par se_3  D  ( a_hh_hh_r ,  las  t_hh_hh_r )  ; 
l_p_hh_hh_r_3D  ( i+1 )  =  length  ( parsed_hh_hh_r_3 D  ( ; ,  1 ) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

(bits_hh_hh_r_3D(i+l) ,  3ih_hh_r_22  (i+1)  ]  =  get_bits2  (parsed_hh_hh_r_3D,  frame^type)  ; 

%  getting  bits  per  frame  per  layer  and  total 

layerl(i+l)  =  bits_ll_ll_r_3D(i+l)  +  bits_ll_lh_r_3D(i+l)  +  ... 
bits_lh_ll_r_3D(i+l)  +  bits_lh_lh_r_3D  (i+1)  +  — 

bits_ll_hl_r_3D{i+l)  +  bits_hl-ll-V_3D(i+l)  +  ... 

bi  t  s_hh_l  l_r_3  D  ( i + 1 )  +  bi  ts_hl_hl_v_3  D  ( i + 1 )  +  ... 

bi ts_hh_hh_r_3  D ( i+1 ) ; 

layer2(i+l)  =  bits_ll_hh_r_3D(i+l)  +  bits_lh_hl_r_3D(i+l)  +  ... 
bits_lh_hh_r_3D(i+l)  +  bits_hl_lh_v_3D(i+l)  +  ... 
bits31-hh_v_3D(i+l) ; 

layer3(i+l)  =  bits_hh_lh_r_3D(i+l)  +  bits_hh_hl_r_3D(i+l)  ; 
total(i+l)  =  layerl(i+l)  +  layer2(i+l)  +  layer3(i+l); 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Unquantize  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


f_fht_ll_ll_uq 

f_fht_ll_lh_uq 

f«fht_ll_hl_uq 

f_fht_ll_hh_uq 

f^fht_lh_ll_uq 

f_fht_lh_lh_uq 

f-fht_lh_hl_uq 


f_fht_ll_ll_q  *  ql; 
f_fht_ll_lh_q  *  q2; 
f-fhtjl_hl„q  *  q2; 
f_fht_ll_hh_q  *  q3; 

f-.fht_lh_ll_q  *  q2; 
f_fht_lh_lh_q  *  q2; 
f_fht_lh_hl_q  *  q3; 


%  \mquantize  of  LH  subbands 


%  unquantize  of  LH  subbands 
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%  unquantize  of  HL  subbands 


f_fht_lh_hh_uq  =  f_fht_lh_hh_q  *  q3; 

f_fht_hl_ll_uq  =  f_fht_hl_ll_q  *  q2; 
f_fht_hl_lh„uq  =  f_fht31-.lh_q  *  q3; 
f^fht_hl_hl_uq  =  f_fht_hl_hl_q  *  q2; 
f-fht.hl_hh„uq  =  f_fht„hl_hh_q  *  q3 ; 

f_fht_hh_ll_uq  =  f_fht_hh_ll_q  *  q3;  %  imquantize  of  HH  subbands 

f_fht_hh_lh_uq  =  f_fht_hh_lh_q  *  q3; 
f_fht_hh_hl_uq  =  f_fht_hh_hl_q  *  q3; 
f_fht_hh_hh_uq  =  f_fht_hh_hh_q  *  q3; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Inverse  Transform  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

f_far_ll  =  remake_3  (f_far_ll,  f_fht_ll_ll_uq,  f_fht_ll_lh_uq,  f_fht_ll_Jil_uq,  . . . 

f_fht_ll_hh_uq,m_vec_ndx,  8)  ^ 

f_far_lh.  =  remake_3  (f_far_lh,  f. 

f_fht_lh_hh_uq,m_vec_ndx,  8) 
f _f ar_hl  =  reinake_3  ( f_f  ar_hl ,  f . 

f_fht_hljih_uq,m_vec_ndx,  8) 
f_far_hh  =  remake_3  (f_far_hh,  f. 
f_f  ht_hh_hh_uq,  m_vec_ndx,  8 ) 


f_far_ll_p2  =  remake_3  {f_far_ll_p2,  f_fht_ll_ll_uq,  f_fht_ll_lh_uq,  . . . 

f_fht_ll_hl_uq,  f_fht_ll_hh__uq,m_vec_ndx,  8)  ;  %  LL  p2  subband  inv  FHT 

f_far_lh_p2  =  remake_3  (f_far_lh_p2,  f_fht_lh_ll_uq,  f_fht_lh_lluuq, . . . 

f_fht_lh_hl_uq,  f_fht_lh_hh_uq,m_vec_ndx,  8) ;  %  LH  p2  subband  inv  FHT 

f_far_hl_p2  =  remake_3  (f_far_hl_p2,  f_fht_hl_ll_uq,  f_fht_hl„lh_uq, — 

f_fht_hl_hl_uq,  f_fht_hl_hh_uq,m_vec_ndx,  8) ;  %  HL  p2  subband  inv  FHT 

f_far_hh_p2  =  remake_3  ( f_far_hh_p2 ,  f_fht_hh_ll_uq,  0,0,... 

f_fbt_hh_hh_uq,m_vec_ndx,  8) ;  %  HH  p2  subband  inv  FHT 


f_far__ll_pl  =  reniake_3  (f_far_ll_pl,  f_fht_ll_ll_uq,  f_fht_ll_lh_uq, — 

f_fht_ll_hl_uq, 0,m_vec„ndx,8) ;  %  LL  pi  s\ibband  inv  FHT 

f_far_lh_pl  =  remake_3  (f_far_lh_pl,  f_fht_lh_ll_uq,  f_fht_lh_lh_uq, . . . 

0,  0,itL.vec_ndx,  8) ;  %  LH  pi  subband  inv  FHT 

f_far_hl_pl  =  remake_3  (f_far_hl_pl,  f_fht_hl_ll_uq,  0, . . . 

f_fht_hl_hl_uq,  0,m_vec_ndx,  8) ;  %  HL  pi  siibband  inv  FHT 

f_far_hh^l  =  remake_3  (f_far_hh_pl,  f_fht_hh_ll_uq,  0,  . . . 

0,  f_fht_hh_hh_uq,m_vec_ndx,8) ;  %  HH  pi  siibband  inv  FHT 


^  Jjij  sujjxjctxj.u  JLiiv  rni 

fht„lh_ll_uq,  f_'fht_lh_lh__uq,  f_fht_lh_hl_uq, 
%  LH  subband  inv  FHT 

fht_hl_ll_uq,  f_fht_hl_lh_uq,  f_fht_hl_hl_uq, 
%  HL  subband  inv  FHT 

fht_hh_ll_uq,  f_fht_hh_lh_uq,  f_fht_hh_hl_uq, 
%  HH  subband  inv  FHT 


f_3  =  remake_3  (f_3 ,  f_far_ll,  f_far__lh,  f_far_hl,  . . . 

f_far_hh,m_vec_ndx,16) ;  %  frame  inv  FHT  with  3  layers 

f_2  =  remake_3  ( f_2 ,  f _f ar_ll_p2 ,  f_f  ar_lh_p2 ,  f_f ar_hl_p2 ,  . . . 

f_far_hh_p2,m_vec_ndx,  16) ;  %  frame  inv  FHT  with  2  layers 

f_l  =  remake_3  ( f _1 ,  f_far_ll_pl,  f_far_lh_pl,  f_far_hl_jpl,  . . . 

f_far_hh_pl,m_vec_ndx,16) ;  %  frame  inv  FHT  with  1  layer 

end  %  ends  frame_type  des crimination 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Make  Display  Size  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

fig_3  =  shape_back ( f _3 ) ;  .%  make  display  dimensions 

fig_2  =  shape_back ( f _2 ) ;  %  make  display  dimensions 

fig_l  =  shape_back(f_l) ;  %  make  display  dimensions 


L3  =  ro\ind(fig_3) 
L2  =  round {fig_2) 
LI  =  round (fig_l) 


%  for  viewing 
%  for  viewing 
%  for  viewing 


%%%%%%%%%%%%%%%%%%%%%%%  Calculates  the  mean- square-error  per  frame  %%%%%%%%%%%%%%%%% 
if  mse 

MSE_3L{i+l,  1)  =  {sum(s\im(  (ijpresent  -  f  ig_3 )  . ^2)  ) ) /176/144; 

MSE_2L{i+l, 1)  =  (sum(sum( (i_present  -  fig_2) .^2) ) ) /176/144; 

MSE_lL(i+l,  1)  =  (sum(sum{  (i_jpresent  -  fig_l)  .  ""2) ) ) /176/144; 
end  %  end  if  mse 
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if  display 


%  The  following  lines  show  the  figure  with  3  layers 
siibplot  (2,2,2) 
image  (L3) 
axis  off 

title ( 'Layers  1,  2,  and  3') 

%  The  following  lines  show  the  figure  with  2  layers 
s\ibplot  (2,2,3) 
image (L2) 

title (' Layers  1  cind  2') 
axis  off 

%  The  following  lines  show  the  figure  with  1  layer 
subplot (2, 2,4) 
image  (LI) 

title ('Layer  1  Only') 

axis  off 

drawnow 

end  %  ends  if  display 

%  closing  reconstruction  occasionally  to  conserve  memory 

if  { (i==20)  1  (i==40)  I  {i==60)  I  (i=:=80)  |  (i==100)  |  (i==120)  |  ti==140)  |  {i==160)  I  (i=:=180)  |  . 
(i==200) 1 (i==220) 1 (i==240) | (i==260) | (i==280)) 
close  all 

end 

end  %  ends  looping  through  frames 

%  The  following  lines  simply  rename  variables  to  be  consistent  with  an  off-line 
%  evaluation  program  used  in  the  development. 

%  LL  dynamic 

BITS_LL_HUFF  =  bits_ll_zz'; 

%  LL  static 

BITS_LL_LL_3D  =  bitS_ll_ll_r_3D' ; 

SCAN_LL_LL_22  =  ll_ll_r_22 ' ; 

SCAN_LL_LL_LEN  =  l_p_ll_ll_r_3D' ; 

BITS_LL_LH_3D  =  bits_ll_lh_r_3D' ; 

SCAN__LL_LH_22  =  ll_lh_r_22'; 

SCAN_LL_LH_LEN  =  l_p_ll_lh„r_3D' ; 

BITS_LL_HL_3D  =  bitS_ll_hl^r_3D' ; 

SCAN_LL_HL_22  =  ll_hl_r_22 ' ; 

SCAN_LL_HL_LEN  =  lj>_ll_hl_r_3D' ; 

BITS_LL_HH_3D  =  bitS_ll_hh_r.3D' ; 

SCAN_LL_HH_22  =  ll_hh_r_22'; 

SCAN_LL_HH_LEN  =  lj>_ll_hh_r_3D' ; 

%  liH  both  types  of  slide 
BITS_LH_LL_3D  =  bits_lh_ll_r_3D' ; 

SCAN_LH_LL_22  =  lh_ll_r_22  '  ; 

SCAN_LH_LL_LEN  =  l^_lh_ll_r_3D' ; 

BITS_LH_LH_3D  =  bits„lh_lh_r_3D' ; 

SCAN_LH_LH_22  =  lh_lh_r_22  '  ; 

SCAN__LH_LH_LEN  =  l_p_lh_lh_r_3D' ; 

BITS_LH_HL_3D  ==  bits_lh_hl_r_3D' ; 

SCAN_LH_HL_22  =  lh_hl_r_22  '  ; 

SCAN_LH_HL_LEN  =  l_p_lh_hl_r_3D' ; 

BITS_LH_HH_3D  =  bits_lh_hh_r_3D' ; 

SCAN_LH_HH_22  =  lh_hh_r_22'; 

SCAN_LH„HH_LEN  =  l_p„lh_hh_r_3D' ; 

%  HL  both  types  of  slides 
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BITS_HL_LL_3D  =  bitS_hl_ll-.v_3D' 
SCAN_HL_LL_22  =  hl_ll_v_22  '  ; 
SCAN_HL_LL_LEN  =  l_p_hl_ll-V_3D' 

BITS_HL_LH_3D  =:  bits_hl_lh_v_3D' 
SCAN_HL_LH_22  =  hl„lh_v_22  '  ; 
SCAN_HL_LH_LEN  =  l_p_hl_lh_v_3D' 

BITS_HL__HL_3D  =  bits_hl_hl_v_3D' 
SCAN_HL_HL_22  =  hl.hl_v_22'; 
SCAN_HL_HL_LEN  =  l^_hl_hl_v_3D' 

BITS_HL__HH_3D  =  bitS_hl_hh_v_3D' 
SCAN_HL_HH_22  =  hl_hh_v_22'; 
SCAN_HL_HH_LEN  =  l_p_hl_hh__v_3D' 

%  HH  from  dynamic 
BITS_HH_3D  =  bits_hh_r_3D' ; 
SCAN_HH__22  =  hh_r_22'; 
SCAN_HH_LEN  =  1  J_hh_r_3D' ; 

%  HH  from  static 
BITS_HH_LL_3D  =  bits_hh_ll_r_3D' 
SCAN__HH_LL_22  =  hh_ll_r_22'; 
SCAN_HH_LL_XiEN  =  l_p_hh„ll_r_3D' 

BITS_HH_LH_3D  =  bits_hlulh„r_3D' 
SCAN_HH_LH_22  =  hh_lh_r_22'; 
SCAN_HH_LH_LEN  =  l_p_hh_lh_r„3D' 

BITS_HH_HL_3D  =  bits_hh_hl_r_3D' 
SCAN_HH_HL_22  =  hh_hl_r_22'; 
SCAN_HH_HL_LEN  =  l_p_hh_hl_r_3D' 

BITS_HH_HH_3D  =  bits_hh_hh_r_3D' 
SCAN_HH_HH_22  =  hh_bh_r_22'; 
SCAN_HH_HH_LEN  =  l_p_hh„hh_r_3D' 


%  saves  parameters  for  evaluation  later 
if  write 

s  =  char ( 'g:\thesis\BR' ) ; 
s  =  strcat  (s,ni3m2str  (br) )  7 

save  (s ,  'm_mat_ndx' , ' BITS_* ' ,  'MSE_* ' , ' SCAN_* ' ,  '  layer* ' ) 
end  %  if  write 


function  [output]  =  dct_of_fht (input, m__vec_ndx) 

%  performs  the  2D  DCT  of  input.  m_vec_ndx  identifies  where  this  operation  needs 
%  to  be  perfoanned. 

output  =  zeros (792 , 8) ; 
offset  =  -8; 

for  ndx  =  1:99 

offset  =  offset  +8; 

if  m_vec_ndx(ndx) 

output (off set+1: off set+8, : )  =  dct2 (input (of fset+l:offset+8, :)) ; 

end 

end 
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function  [  f _f ht_l  1 ,  f _f ht_lh ,  f _f ht_hl ,  f ht_hh ]  =  f ht  ( f _vec ,  m_vec_ndx ,  in ) 

%  Performs  the  FHT  of  the  appropriate  IN  x  IN  macroblocks  of  F_VEC  as 
%  specified  by  M_YEC_NDX.  Returns  four  sets  of  99  IN/2  x  IN/2  matrices, 

%  each  stacked  into  one  big  99*IN/2  x  IN/2  matrix.  The  places 
%  where  the  fht  was  not  performed  are  filled  with  zeroes  as  place-holders. 

half  =  in/2; 

f_fht_ll  =  zeros (99*half, half )  ; 
f _f  ht_lh  =:  zer os  ( 9  9  *hal  f ,  hal  f )  ; 
f_fht_hl  =  zeros ( 99 *half, half) ; 
f „f ht_hh  -  zeros { 9  9 ♦hal f , hal f ) ; 
mask^lh  =  [1  1;-1  -1] ; 
mask.hl  =  [1  -1;1  -1] ; 
mask_hh  =  [1  -1;-1  1] ; 

offset  =  -in; 


for  ndx  =1:99 

offset=  offset+in; 
oso2  =  offset/2; 

if  m_vec_ndx (ndx) 

for  mdx  =  l:half 

for  cndx  =  l:half 

f_fht_ll  (mdx+oso2 ,  cndx)  =  ... 

sum ( sum { f _vec  (mdx*2-l+offset:rndx*2+offset,cndx*2-l:cndx*2) ) )  /4; 
f_fht_lh(mdx+oso2, cndx)  =  sum ( sum (mask_lh  .♦  ... 

f_vec (mdx*2-l+offset :rndx*2+of fset, cndx*2-l :cndx*2) ) ) /4; 

f_fht_hl (mdx+oso2, cndx).  =  Siam ( s\am (mask_hl  .* _ 

f_vec (mdx*2-l+of fset :rndx*2+of fset, cndx*2-l :cndx*2) ) ) /4; 
f _f ht__hh  ( mdx+oso2 ,  cndx)  =  sum  ( sum  (mask_hh  .  *  ... 

f_vec (mdx*2-l+offset:rndx*2+offset,cndx*2-l:cndx*2) ) ) /4; 


end 

end 


end 


end 


function  [bits]  =  get_bits_Huff (parsed) 

%  Uses  Huffman  table  to  get  bits  for  JPEG-based  compression  of  PARSED, 
global  HUFF_TABLE 
bits  =0; 

[r,c]  =  size (parsed) ; 

for  ndx  =  l:r 

while (parsed (ndx, 1)  >=  15) 
bits  =  bits  +11; 

parsed (ndx, 1)  =  parsed (ndx, 1)  -  15; 
end 

table^row  =  find(  (HUFF_TABLE( ; ,  2)  ==  parsed  (ndx,  1) )  &  ... 

(HUFF_TABLE(:,3)  ==  parsed  (ndx,  2)  ))  ; 
bits  =  bits  +  HUFF_TABLE ( table_row, 4 )  +  parsed(ndx, 2) ; 
end 
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bits  =  bits  +  99*4; 


f\mction  [bits, count]  =  get_bits2  (parsed, kind) 

%  Uses  PARSED  to  fetch  bits  from  custom  VLC  tables 
%  KIND  will  be  1  for  static  scenes 
%  KIND  will  be  0  for  dynamic  scenec 

global  VLC^DYN  VLC_STA 

bits  =  0; 
count  =  0; 

parsed  =  abs (parsed) ; 

[r,c]  =  size (parsed) ; 

if  kind 

for  ndx  =  l:r 

poss_rows  =  find(VLC_STA(:,2)  ==  parsed  (ndx,  1) )  ; 
start  =  poss_rows (1) ; 

poss_rows  =  find(VLC_STA(  (poss_rows  (1)  :poss_rows  (length (poss^rows) ) ) ,  3)  ==. . . 
parsed (ndx, 2) ) ; 

if  - ( isempty (poss_rows ) ) 

poss_rows  =  poss_rows  +  start  -  1; 
start  =  poss_rows ( 1 )  ; 

end 

if  -  ( isempty  {poss_rows ) ) 

poss_rows  =  find  ( VLC_STA  ( ( poss_rows  ( 1 )  : poss_rows  ( length  (poss_rows ) ) ) ,  4 )  == . 
parsed(ndx,3) ) ; 

poss_rows  =  poss_rows  +  start  ~  1; 
end 

if  -  ( isempty  ( po s s_rows ) ) 

bits  =  bits  +  VLC_STA(poss_rows,  5)  ; 
else 

bits  bits  +  22; 
count  =  count  +  1; 
end 

end 

else 

for  ndx  =  l:r 

poss_rows  =  find  (VLC_DYN  ( : ,  2 )  ==  parsed  (ndx ,  1 ) ) ; 
start  =  poss_rows (1) ; 

poss_rows  =  find(VLC_DYN(  (poss_rows(l)  :poss_rows(length(poss_rows) ))  ,3)  ==... 
parsed(ndx,  2) )  ; 

if  ~  ( isempty  (poss_rows ) ) 

poss^rows  =  poss^rows  +  start  -  1; 
start  =  poss_rows (1) ; 

end 

if  -( isempty  (poss_rows ) ) 

poss_rows  =  find  ( VLC_DYN  ( (poss_rows  { 1 )  : poss_rows  ( length  (poss^rows ) ) ) ,  4 )  == . 
parsed (ndx, 3) ) ; 

poss_rows  =  poss_rows  +  start  -  1; 
end 

if  -  ( isempty  ( pos s_rows ) ) 

bits  =  bits  +  VLC_DYN(poss_rows,  5)  ; 
else 

bits  =  bits  +22; 
count  =  count  +  1; 
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end 


end 


end 

bits  =  bits  +  99; 


fimction  [f]  =  get_next_iniage (niam) 

%  Gets  next  frame  as  a  .bmp  file  converts  to  form  needed  in  MATLAB 

s  =  char { ' g : \pictnres \Head ' ) ; 
fn  s=  strcat  (s,num2str  (nim) ) ; 
f  =  imread(fn, 'bmp' ) ; 
f  =  double(f(;,:,l)); 


function  [f]  =  get_next_iinage_2  (num) 

%  Gets  next  frame  as  a  .bmp  file  converts  to  form  needed  in  MATLAB 

s  =  char { 'g: \pictures\ncaa' ) ; 
fn  =  strcat (s,num2str (num) ) ; 
f  =  imread(fn, 'bmp' ) ; 
f  =  double (f 


fiinction  [Q]  =  get_Ql_matrix(ql) 

%  makes  the  JPEG  standard  quantization  matrix  and  multiplies  it  by  ql. 

%  ql  =  16  will  result  in  no  scaling  of  the  matrix  when  coupled  with  the  rest 
%  of  the  code,  ql  <  16  is  finer  quantization,  i.e.  less  quantization  noise  results. 

Q  =  [16  11  10  16  24  40  51  61; 


12 

12 

14 

19 

26 

58 

60  55; 

14 

13 

16 

24 

40 

57 

69  56; 

14 

17 

22 

29 

51 

87 

80  62; 

18 

22 

37 

56 

68 

109 

103  77; 

24 

35 

55 

64 

81 

104 

113  92; 

49 

64 

78 

87 

103 

121  120  101 

72 

92 

95 

98 

112 

100  103  99] 

function  [out]  =  get_qd_entryf (flag, default, entry, delta_Binter,MBnum, beta) 

%  selects  the  appropriate  quantizer  triplet  based  on  the  input  parameters 
%  FLAG  implies  the  first  frame  following  a  scene  change  frame.  DEFAULT  is  the 
%  choice  of  triplet  based  on  test  sequences  and  serves  as  a  starting  point  for  a 
%  new  sequence.  ENTRY  hold  the  table  entry  from  the  previous  frame.  The  remaining 
%  parameters  ara  as  defined  in  the  thesis. 

global  QD_TABLE 

if  (flag) 
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out  =  min  (find  (default  >  QD_TABLE  ( : ,  4 ) ) )  ; 
else 

deltaQ  =  fix(delta_Binter/MBnum/beta) ; 
out  =  entry  +  deltaQ; 
if  (out  >  17) 
out  =17; 

end 

if  (out  <  1) 
out  =  1; 

end 

end 


function  [run]  =  get_run(seql.to_code,  len) 

%  Called  by  the  parsing  functions  PARSE_3D  and  PARSE_HUFF,  this  fxinction  obtains 
%  the  RUN  field  for  RLE. 

run  =  zeros (len, 1) ; 

coxmt  =  0; 
place  =1; 
mdx  =1; 

while  (mdx  <=  length  (seq_t opcode ) ) 
if  seq_to_code  (mdx) 
place  =  place  +  1; 
mdx  =  mdx  +  1; 
else 

while  (seq„to_code(mdx)  ==  0) 
mdx  =  mdx  +  1; 
count  =  count  +1; 

end 

run (place)  =  count; 

count  =  0; 

place  =  place  +  1; 

mdx  =  mdx  +  1;  ’ 

end 

end 


function  [output]  =  invdct_of_fht (input, in__vec_ndx) 

%  Performs  the  inverse  2D  DCT.  M_VEC_NDX  identifies  where  this  operation  needs  to  be 
%  performed. 

output  =  zeros (792,8); 
offset  =  -8; 

for  ndx  =  1:99 

offset  =  offset  +  8; 

if  m_vec_ndx(ndx) 

output (of fset+1: off set+8, : )  =  idct2 (input (off set+1: of fset+8, :)) ;  , 

end 

end 


91 


function  [m__vec_ndx,byAge,b_count,inb_count]  =  ni_blk_id_xr  (fl,  f2,T) 

%  Figures  out  which  16x16  macroblocks  of  the  144x176  image  need  to  be  further 
%  processed  via  an  absolute  sum  of  differences  between  the  last  frame  fl  and 
%  the  current  frame  f2.  m_vec_ndx  is  99x1  vector  of  0  or  1  raster  scanned. 

%  T  is  the  threshold  utilized.  byAge  is  1  if  macroblocks  are  selected  only  due 
%  to  aging.  byAge  is  0  otherwise.  b_count  and  mb_count  were  used  in  refining  the 
%  search  sequence.  b_count  is  "block  count."  mb_coxmt  is  "macroblock  count."  As 
%  are  artifacts  now,  they  are  commented- out. 

global  LAST  RV 

%b_count  =  0; 

%mb_count  =  0; 
recount  =  0; 

m_vec_ndx  =  zeros (99,1); 

for  ndx  =1:99 

if  (RV(ndx)  ==  0) 

m__vec_ndx(ndx)  =  1; 

RV(ndx)  =  floor (21*rand) ; 
else 

RV(ndx)  =  RV(ndx)  -  1; 
end 

end 

byAge  1  =  sum(m_vec_ndx) ; 

for  mdx  =  1:16:1584 

recount  =  recount  +1; 

i  f  ( -'m_vec_ndx  { r_c  ount ) ) 
go  =  1; 

if  ( (IiAST{r_co\jnt)  ==  1)  &  (go)) 
for  cndx  =  1 

f l_8x8  =  f 1 ( rndx ; rndx+ 7 , cndx : cndx+7 ) ; 
f2_8x8  =  f 2  (mdx: rndx+7,  cndx:  cndx+7) ; 
asd  =  abs  (siam(sxim(fl_8x8-f2_8x8) ) )  ; 

if  (asd  >  T) 

in_vec_ndx( recount)  =  1;  %  ID's  position 
%inb_coimt  =  mb_co\mt  +  1; 

%b_count  =  b_count  +  1; 

LAST ( r_count )  =  1 ; 
go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl(mdx+8:mdx+15,cndx+8:cndx+15)  ; 
f  2_8x8  =  f  2  ( rndx+8 :  mdx+15 ,  cndx+8 :  cndx+15 )  ; 
asd  =  abs  (sum(sum(fl_8x8-f2_8x8) ) ) ;  • 


if  (asd  >  T) 

m_vec_ndx  ( r_count )  = 
%mb_count  =  mb_count 
%b_count  =  b_count  + 
LAST(r_count)  =  3; 
go  =  0; 

break;  %  gets  out  of 
end 


1;  %  ID'S  position 
+  1; 

2; 


the  inner  loop  if  justified  now 


f  l_8x8  =  f  1  ( rndx :  mdx+7 ,  cndx+  8 ;  cndx+15 )  ; 
f  2_8x8  =  f 2  ( rndx : mdx+7 ,  cndx+8 : cndx+15 ) ; 
asd  =  abs .{sum(sum(fl_8x8-f2_8x8) ) ) ; 


if  (asd  >  T) 

m_vec_ndx(r_coimt)  =  1;  %  ID's  position 
%mb_count  =  mb_count  +  1; 

%b_count  =  b_count  +  3; 

LAST(r_count)  =  2; 
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go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

f  l_8x8  =  fl  (rndx+8  :mdx+15 ,  cndx:  cndx+7)  ; 
f2_8x8  =  f2  (rndx+8:mdx+15,cndx;cndx+7)  ; 
asd  =  abs  (s^axn(suln(fl_8x8-f2_8x8) ) )  ; 

if  (asd  >  T) 

m_vec_ndx (recount)  =  1;  %  ID's  position 
%inb_co\int  =  mb^count  +  1; 

%b_count  =  b_count  +  4; 

IAST(r_coTint)  =  4; 
go  =  0; 

end 

end  %for  cndx 
end  %  if  LAST 

if  ( (LAST (recount)  ==  2)  &  (go)) 
for  cndx  =1 

f  l_8x8  =  f  1  ( rndx :  mdx+7 ,  cndx+  8 :  cndx+15 ) ; 
f2_8x8  =  f 2 (rndx; rndx+7,cndx+8: cndx+15) ; 
asd  =  abs  (sum(suin(fl_8x8-f2_8x8)  ) )  ; 

if  (asd  >  T) 

in__vec_ndx(r_count)  =  1;  %  ID'S  position 
%mb_count  =  mb^count  +  1; 

%b_count  =  b_count  +  1; 

LAST (recount)  =  2; 
go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

f  l_8x8  =  f  1  (mdx+8 :  rndx+15 ,  cndx :  cndx+7 )  ; 
f2_8x8  =  f 2 (rndx+8: rndx+15, cndx: cndx+7 ) ; 
asd  =  abs  (sum(s\am(fl_8x8~f2_8x8) ) )  ; 

if  (asd  >  T) 

ituvec_nd?c( recount)  =  1;  %  ID's  position 
%inb_count  =  inb_count  +  1; 

%b_count  =  b_count  +  2; 

LAST(r_count)  =  4; 
go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl  (mdx+8: rndx+15,  cndx+8: cndx+15 )  ; 
f 2_8x8  =  f2  (mdx+8 : rndx+15 ,  cndx+8 : cndx+15)  ; 
asd  =  abs  (suin(sum(fl_8x8“f2_8x8) ) )  ; 

if  (asd  >  T) 

m_vec_ndx( recount)  =  1;  %  ID's  position 
%inb_count  =  inb_count  +  1; 

%b_co\ant  -  b_comt  +  3; 

LAST (recount)  =  3; 
go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

f  l_8x8  =  f  1  ( mdx :  mdx+7 ,  cndx :  cndx+7 )  ; 
f  2_8x8  =  f  2  ( mdx :  mdx+7 ,  cndx :  cndx+7 ) ; 
asd  =  abs  (slim (s\am(fl_8x8- f2_8x8) )) ; 

if  (asd  >  T) 

in_vec_ndx(r_co\int)  =  1;  %  ID's  position 
%inb_co\mt  =  mb^count  +  1; 

%b_co\int  =  b_count  +  4; 

.LAST(r_co\mt)  =  1; 
go  =  0; 
end 
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end  %for  cndx 
end  %  if  LAST 

if  ( (LAST (recount)  ==  3)  &  (go)) 
for  cndx  =  1 

fl_8x8  =  fl (rndx+8:rndx+15,cndx+8 :cndx+15) ; 
f2_8x8  =  f2  (rndx+8;mdx+15,cndx+8:cndx+15)  ; 
asd  =  abs ( Siam { sum ( f l__8x8 - f 2_8x8) ))  ; 

if  (asd  >  T) 

in_vec_ndx( recount)  =  1;  %  ID's  position 
%mb_count  =  inb_count  +  1; 

%b_count  =  b_co\int  +  1; 

LAST ( r_count )  =  3 ; 
go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

f l_8x8  =  f 1 ( rndx : rndx+7 , cndx : cndx+7 ) ; 

,f  2_8x8  =  f  2  ( rndx ;  rndx+7 ,  cndx :  cndx+7 )  ; 
asd  =  abs (sum(sum(fl_8x8-f2_8x8) ) )  ; 

if  (asd  >  T) 

m_vec_ndx( recount)  =  1;  %  ID's  position 
%inb_count  =  mb^count  +  1; 

%b_count  =  b_count  +2; 

LAST (recount)  =  1; 
go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

f  l_8x8  =  f  1  (mdx+8 :  rndx+15 ,  cndx :  cndx+7 )  ; 
f 2_8x8  =  f 2  (mdx+8 : rndx+15 ,  cndx :  cndx+7 )  ; 
asd  =  abs (sum(s\am(fl_8x8-f2_8x8) ) )  ; 

if  (asd  >  T) 

m_vec_ndx( recount)  =  1;  %  ID's  position 
%mb_count  =  mb_count  +  1; 

%b_co\mt  =  b_count  +  3; 

LAST  (recount)  =  4; 
go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl(mdx:mdx+7,cndx+8:cndx+15)  ; 
f 2_8x8  =  f 2 ( rndx : rndx+7 , cndx+8 : cndx+15 ) ; 
asd  =  abs  (sum(s\m(fl_8x8-f2_8x8) ) )  ; 

if  (asd  >  T) 

nL_vec_ndx( recount)  =  1;  %  ID's  position 
%mb_count  =  mb_count  +  1; 

%b_count  =  b_count  +  4; 

LAST (recount)  -2; 
go  =  0; 

end 

end  %for  cndx 
end  %  if  LAST 

if  ( (LAST  (recount)  ==  4)  &  (go)) 
for  cndx  =  1 

fl_8x8  =  fl  (mdx+8 : rndx+15,  cndx: cndx+7 )  ; 
f 2_8x8  =  f2 (rndx+8 : rndx+15 , cndx: cndx+7) ; 
asd  =  abs  (sum(s\im(fl_8x8-f2_8x8) ) )  ; 

if  (asd  >  T) 

nuvec_ndx( recount)  =  1;  %  ID's  position 
%inb_count  =  inb_count  +  1; 

%b_count  =  b_count  +  1; 

LAST (recount)  =  4; 
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go  =  0; 

break?  %  gets  out  of  the  inner  loop  if  justified  now 
end 

f l_8x8  =  f 1 ( rndx : rndx+7 , cndx+8 : cndx+15 ) ; 
f2_8x8  =  f 2  (mdx:mdx+7, cndx+8: cndx+15) ; 
asd  =  abs  (suin(suin(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

in_vec_ndx( recount)  =  1;  %  ID's  position 
%mb_count  =  mb^count  +  1; 

%b_count  =  b_count  +  2? 

IiAST(r_coxint)  =  2; 
go  =  0? 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl(mdx:mdx+7,cndx:cndx+7) ; 
f2_8x8  =  f 2  (mdx: rndx+7, chdx:cndx+7) ; 
asd  =  abs  (sim(suni(fl_8x8-f2__8x8) ) )  ? 

if  (asd  >  T) 

ni_vec_ndx(r__co\xnt)  =  1;  %  ID's  position 
%inb_count  =  inb_count  +  1; 

%b_count  =  b_co\int  +  3; 

IiAST(r_count)  =  1; 
go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl(mdx+8:mdx+15, cndx+8: cndx+15 )  ? 
f2_8x8  =  f 2  (mdx+8:mdx+15, cndx+8: cndx+15 ) ; 
asd  =  abs  (suin(sum{fl_8x8-f2_8x8) ) )  ? 

if  (asd  >  T) 

m_vec_ndx( recount)  =  1;  %  ID'S  position 
%nib_count  =  mb^count  +  1; 

%b__count  =  b_count  +  4? 

LAST(r_count)  =  3; 
go  =  0; 
end 

end  %for  cndx 
end  %  if  LAST 

end  %if  -m^vec^ndx 

end  %  for  mdx 

byAge2  =  suin(xti_vec_ndx)  ? 
byAge  =  (byAgel  ==  byAge2); 


fxanction  [table]  =  inake_HAC_table  ( )  ; 

%  Generates  the  Huffman  VLC  table 

vecl  =  ones (10,1); 
vec2  =  (1:10) 

index  =  (1:162) ' ; 

runn  =  [0?vecl*0;vecl;vecl*2;vecl*3?vecl*4;vecl*5;vecl*6? 
vecl*7 ;vecl*8  ? vecl*9 ; vecl*10  ? vecl*ll ? vecl*12 ; vecl*13 ; 
vecl*14;vecl*15;153 ; 

siz  =  [ 0 ; vec2  ?  vec2 ; vec2 ; vec2 ; vec2 ; vec2 ; vec2 ; vec2 ; vec2 ; 

vec2 ;  vec2 ;  vec2 ;  vec2 ;  vec2 ;  vec2 ;  0  ?  vec2  ]  ; 
cw_len  =  [0;2;2;3?4?5?7;8;10?16;16;4?5?7?9;11;16?16?16;16? 
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16;5;8;10;12;ones(6,l)*16;6;9;12;ones(7,l)*16;6;10; 
onGs(8,l)*16;8;ll;ones(8,l)*16;7;12;ones(8,l) *16;8; 
12;ones(8,l)*16;9;ones{9,l)*16;9;ones(9,l)*16;9; 
ones(9,l)*16;10;ones(9,l)*16;10;ones(9,l) 
ones (19, 1) *16; ll;ones (10, 1) *16] ; 
table  =  [index,  sninn,  siz,  cw_len]  ; 


function  [a]  =  inake_it_compact  (big,  index,  size) 

%  gets  rid  of  trailing  zeros  in  the  vector  BIG  that  resulted  from  scanning  99  matrices 
%  of  dimensions  SIZExSIZE.  INDEX  holds  the  position  of  the  last  non-zero  entry  in  each  of 
%  the  99  matrices. 


entry  =  size'^Z; 
offset  =  -entry; 
a  =  []; 

for  ndx  =  1:99 

offset  =  entry  +  offset; 
if  isinf( index (ndx) ) 
a  =  [a  ;  inf] ; 
else 

a  “  [a  ;  [big(offset+l:offset+index(ndx) ) ] ] ; 

end 

end 


function  [parsed]  =  parse_3D (vector, index) 

%  RLE's  VECTOR  into  the  {last, riin, level}  format.  INDEX  is  the  position  of  the  last  non 
%  zero  entity  in  each  of  the  99  matrices. 

last  =  []; 
level  =  []; 
run  =  []  ; 

for  ndx  =1:99 

if  i s inf (index (ndx) ) 
index (ndx)  =1; 

end 

point  =  sum  ( index  ( 1 :  ndx) ) ; 

seq_t opcode  =  vector (point- index (ndx) +i : point) ; 
len  =  length  ( seq_to_code) ; 
if  (den  ==  1)  &  is  inf  (seq„t  opcode)) 
last  =  [last;0]; 
level  =  [ level ;0]; 
xun.  =  [run;0] ; 
else 

dummy  =  seq_t opcode (find(seq_to_c ode) )  ; 
level  =  [  level ;  dummy] ; 

last  =  [last;zeros  (length(d\ammy) -1, 1) ;  1] ; 
this_run  =  get_run(seq_to_code,  length (diimmy) ) ; 
run  =  [run;this_run] ; 

end 

end 

parsed  =  [last,  run, level] ; 
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function  [parsed]  =  par se_Huff (vector, index) 

%  Parses  VECTOR  into  the  JPEG  format.  INDEX  holds  the  position  of  the  last  non-zero 
%  value  of  each  of  the  99  matrices. 

last  =  [ ] ; 
level  =  [ ]  ; 
nan  =  I  ]  ; 

for  ndx  =1:99 

if  is inf (index (ndx) ) 
index (ndx)  =  1; 

end 

point  =  sum  (index  ( 1  :ndx) ) ; 

seq_to_code  =  vector  (point-index  (ndx) +1  :point)  ; 
len  =  length  ( seq_to_code ) ; 
if  ((len  ==  1)  &  isinf  (seci_to_code) ) 
level  =  [ level ; 0 ] ; 
run  =  [run;0)  ; 
else 

dumny  =  seq_t  opcode  (find  (seq_t  opcode) ) ; 
level  =  [ level  7  dummy] ; 

this_run  =  get_nin  (seq_to_c ode,  length  (dummy) ) ; 
r\in  =  [run;  this_run] ; 

end 

end 

level  =  abs(level); 

S  =  zeros (length (level) ,1)  ; 

for  mdx  =  1: length (level) 
if  level  (mdx) 

S  (mdx)  =  length  ( dec2bin  ( level  (mdx) ) )  ; 
else 

S (mdx)  =  0  7 

end 

end 

parsed  =  [run,S]7 


function  [ll_q]  =  quantizer_ll (ll,Q,m__v€C_ndx) 

%  Quantizes  matrix  11  with  quantization  matrix  Q.  M_VEC_NDX  identifies 
%  where  this  qucuitization  need  be  performed. 

% 

%  see  UNQUANTIZE^Ui 

ll_q  =  zeros (792, 8) 7 
offset  =  -8; 

11  =  11*  16; 

for  ndx  =1:99 

offset  =  offset  +8; 

if  m_,vec_ndx(ndx) 

ll_q(offset+l:offset+8, :)  =  roionddl  (offset+l:offset+8, : )  ./  Q)  ; 

end 


end 
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function  [out,rast]  =  raster (mat, in) 

%  Horizontal  raster  scans  the  input  matris  MAT  of  dimensions  INxIN. 

entry  =  in'^2; 
offset  =  -entry; 
mat  =  mat, ' ; 

rast  =  reshape (mat, entry *9 9, 1) ; 

for  ndx  =1:99 

offset  =  offset +entry; 

dioinray  =  max  ( find  (rast  ( off  set+1  toff  set+entry) ))  ; 

if  isempty (dummy) 
out (ndx)  =  inf; 
else 

out  ( ndx )  =  dummy  ; 

end 

end 

out  =  out , ' ; 


function  [ out ]  =  remake_3  ( f_3 ,  f_f ht_ll ,  f _f ht_lh,  f_f ht__hl ,  f _f ht_hh, m_vec_ndx,  in) 

%  Performs  the  inverse  FHT.  F_3  is  the  present  content  of  the  reconstructed  image. 
%  The  next  for  parameters  are  the  sxibbands  that  update  this  content  as  dictated 
%  by  the  content  of  M_VEC_NDX.  INxIN  is  the  matrix  dimensions. 

half  =  in/ 2; 

if  (f_fht_ll  ==  0) 

f_fht_ll  =  zeros  (99 *half,  half )  ; 

end 

if  (f^fht.lh  ==  0) 

f_fht_lh  =  zeros  (99*half, half )  ; 

end 

if  (f_fht_hl  ==  0) 

f_fht_hl  =  zeros  (99*half, half); 

end 

if  (f_fht_hh  ==  0) 

f_fht_hh  =  zeros (99*half, half )  ; 

end 

B  =  [1  1  1  1;1  1  -1  -1;1  -1  1  -1;1  -1  -1  1] ; 

X  =  zeros (4,1) ; 
a  =  zeros (4,1) ; 
offset  =  -half; 
out  =  zeros (99*in, in) ; 
f_3  =  f^3  /4; 


for  ndx  =  1:99 

offset  =  offset  +  half; 
ost2  =  offset*2; 

if  m_vec_ndx(ndx) 

for  mdx  =  l:half 

for  cndx  =  l:half 

X  =  Ef_fht_ll  (mdx+off set,  cndx) ; 
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f_f ht_lh  ( rndx+of  f set ,  cndx)  ; 
f_fhtJil(rndx+offset,cndx)  ; 
f__fht_hh ( rndx+of f set, cndx)  ]  ; 
a  =  B\x; 

f_3  <mdx*2~l+ost2:mdx*2+ost2,cndx*2-l:cndx*2)  =  — 
[a(l)  a(2);  a(3)  a(4)]; 


end 

end 


end 

end 

out  =  f_3*4; 


function  [f_out]  =  shape  {f_in) 

%  Shapes  the  144x176  f_in  into  a  1584x16  matrix  taken  16  rows  at  a  time 
%  left  to  right,  top  to  bottom,  (raster  scans  the  macroblocks.) 

f_out  =  [ )  ; 

for  mdx  =  [1  17  33  49  65  81  97  113  129] 

for  cndx  =  [1  17  33  49  65  81  97  113  129  145  161] 

f _out  =  [  f _out ;  [  f_in  ( mdx :  mdx+1 5 ,  cndx :  cndx+ 15 )  ]  ] ; 

end 

end 


f\inction  [f_out]  =  shape_back{f_in) 

%  Shapes  the  1584x16  matrix  back  into  a  144x176  image. 

f_out  =  [  1  ; 
f_row  =  []; 
offset  =  -176 7 

for  i  =  1:9 

offset  =  offset  +  176; 

for  mdx  =  ([1  17  33  49  65  81  97  113  129  145  161]  +  offset) 
f _row  =  [  f _row,  [  f _in  ( rndx :  mdx+1 5 ,:)]}; 

^d 

f_out  =  Ef_out;f_row] ; 
f  _row  =  El; 

end 


fiinction  [11]  =  unquantize_ll  (ll_q,Q,m_vec__ndx) 

%  Unquantizes  ll_q  with  quantization  matrix  Q.  M__VEC_NDX  idenntifies  where  this 
%  unqucuntization  need  be  performed. 

% 

%  see  QUANTIZE^LL 


11  =  zeros (792, 8) ; 
offset  =  -8; 


for  ndx  =  1:99 

offset  =  offset  +  8; 

if  in_vec_ndx(ndx) 

ll(offset+l:offset+8, : )  =  ll_q{of f set+1 :of fset+8, : )  .*  Q; 

end 


end 

11  =  11  /  16; 


function  [out, vert]  =  vertical (mat, in) 

%  raster  scans  vertically  top  to  bottom  99  matrices  of  size  INxIN  contained  in  MAT. 

offset  =  -in; 

offset2  =  -in^2; 

vert  =  zeros (offset2*99, 1) ; 

if  (in  ==  8) 

for  ndx  =  1:99 

offset  =  offset  +  8; 
offset2  =  offset2  +64; 

vert (off set2+l:of fset2+64, 1) = [mat (of fset+1 :of fset+8, 1) ;mat (of fset+l:of f set+8, 2) ; 
mat(offset+l:offset+8,3) ;mat (of fset+1 :of fset+8, 4) ;mat (offset+l:offset+8, 5) ; 
mat (of fset+1 : of fset+8 , 6 ) ;mat (of fset+1 : of fset+8 , 7) ;mat (of fset+1 : of fset+8 , 8 ) ] ; 
dummy  =  max(find(vert(offset2+i:offset2+64) ) ) ; 

if  iseit^ty (dummy) 
out (ndx)  =  inf; 
else 

out  (ndx)  =  dummy; 

end 


end 

else 

for  ndx  =  1:99 

offset  =  offset  +  4; 
offset2  =  offset2  +16; 

vert (of f set2+l : of f set2+16 , 1) = [mat (of fset+1 : of f set+4, 1 ) ;mat (of fset+1 ; of f set+4 ,2); 

mat(offset+l:offset+4,3) ;mat (of fset+l:offset+4, 4) ] ; 
dummy  =  inax(find{vert  (offset2+l:offset2+16) ) )  ; 

if  isempty (dummy) 
out (ndx)  =  inf; 
else 

out  (ndx)  =  dumrty; 

end 

end 


end 

out  =  out . ' ; 


function  [place, out]  =  zzb(bigmat,M) 

%  zigzag  scans  the  99  matrices  of  size  MxM  contained  in  BIGMAT. 
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vec  =  zeros  (M*M,  1) ; 
loop  =  0; 
out  =  [ ] ; 


for  ndx  =  1:M:M*99 
loop  =  loop  +  1; 
mat  -  bigmat  (ndx :  ndx+M- 1 , : )  ; 
vec(l)  =  mat (1,1) ; 

%  start  scan  at 

X  =  2; 

y  =  1; 
index  =2; 

%  initial  scan  direction 
xstep  =  -1; 
ystep  =1; 

%  process  each  interior  diagonal 
for  i=2:(2*M-2) 

%  determine  diagonal  length 
if  (i  >  M) 

len  =  2*M  -  i; 
else 

len  =  i; 

end 

%  run  the  diagonal 
for  j  =  l:len 

vec  ( index)  =  mat  (y ,  x)  ; 

%  move  to  next  point 
X  =  X  +  xstep; 
y  =  y  +  ystep; 
index  =  index  +  1; 

end 

%  set  up  next  pass 
xstep  =  -xstep; 
ystep  =  -ystep; 

if  (X  0) 


if  (y  <=  M) 

X  =  1; 
else 

X  =  X  +  2; 

y  =  M; 
end 

elseif  (x  >  M) 

X  =  M; 
y  =  y  +  2; 

end 

if  (y  ==  0) 
if  (X  <=  M) 

y  =  1; 
else 

X  =  M; 
y  =  y  +  2; 
end 


elseif  (y  >  M) 


end 


vec(M*M)  =inat(M,M); 
diamny  =  max  ( find  ( vec ) )  ; 

if  isempty(diaminy) 
place ( loop)  =  inf ; 
else 

place (loop)  =  dummy; 
end 

out  =  [out; vec] ; 

end 

place  =  place'; 
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